Linux RAID subsystem development
* RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
@ 2023-05-17 13:26 raid
  2023-05-17 23:45 ` Wol
  0 siblings, 1 reply; 9+ messages in thread
From: raid @ 2023-05-17 13:26 UTC (permalink / raw)
  To: linux-raid

RAID5 Phantom Drive Appeared while Reshaping Four Drive Array
(HARDLOCK)

I've been struggling with this for about two weeks now and have realized
that I need some expert help.

My original, 18-month-old RAID5 consists of three newer TOSHIBA drives:
/dev/sdc :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
bytes)
/dev/sdd :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
bytes)
/dev/sde :: TOSHIBA MG08ACA16TE (4002) :: 16 TB (16,000,900,661,248
bytes)

Recently added...
/dev/sdf :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
bytes)

In a nutshell, I added a fourth drive to my RAID5 and ran --grow; mdadm
estimated completion in 3-5 days.
At about 30-50% of the reshape, the computer hard locked, and pressing
the reset button was the only (agonizing) option.
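
(For context, the grow was the standard add-then-reshape sequence,
roughly along these lines; a sketch, not an exact transcript of the
session:)

mdadm --add /dev/md480 /dev/sdf1            # add the new disk as a spare
mdadm --grow /dev/md480 --raid-devices=4    # start the 3->4 reshape
cat /proc/mdstat                            # reshape progress / ETA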

After the first reboot mdadm assembled the array and the reshape
continued, but it now showed a fifth physical disk.
The phantom FIFTH drive appeared as failed, while the other four
continued reshaping, temporarily.
After another day or so the reshaping speed dropped to 0. It was near
80%, I think.
So I ran mdadm -S and then mdadm --assemble --scan, but it couldn't
start: not enough drives to start the array (because of the phantom
drive?). The Array State on each member shows the fifth drive with
varying status.
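
(Roughly, that stop/re-assemble step, spelled out; assuming the array
name from mdadm.conf:)

mdadm -S /dev/md480        # stop the stalled array
mdadm --assemble --scan    # re-assemble from mdadm.conf / member superblocks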

The file system (ext4) appears damaged and won't mount (unrecognized
filesystem).
20 TB are backed up; there are, however, about 7000 newly scanned
documents that aren't.
I've done a cursory examination of the data using R-Linux and a bit of
in-depth peeking using Active Disk Editor.

Life goes on. I've researched and read way more than I ever thought I
would about mdadm RAID, but I'm no closer to knowing how to proceed.
I'm a hardware technician with some software skills, and I'm stumped.
I'm also trying to be cautious not to damage what's left of the RAID.
ANY help with commands I can attempt to at least get the RAID to
assemble WITHOUT the phantom fifth drive would be immensely
appreciated.

All four drives now appear as spares.

---
watch -c -d -n 1 cat /proc/mdstat
md480 : inactive sdc1[0](S) sdd1[1](S) sdf1[4](S) sde1[3](S)
      62502985709 blocks super 1.2
---
uname -a
Linux OAK2023 4.19.0-24-amd64 #1 SMP Debian 4.19.282-1 (2023-04-29)
x86_64 GNU/Linux
---
mdadm --version
mdadm - v4.1 - 2018-10-01
---
mdadm -E /dev/sd[c-f]1 
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
           Name : GRANDSLAM:480
  Creation Time : Tue Oct 26 14:06:53 2021
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
     Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
  Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
    Data Offset : 264192 sectors
     New Offset : 261120 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 8f0835db:3ea24540:2ab4232d:6203d1b7

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
  Delta Devices : 1 (4->5)

    Update Time : Thu May  4 14:39:03 2023
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 37ac3c04 - correct
         Events : 78714

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
replacing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
           Name : GRANDSLAM:480
  Creation Time : Tue Oct 26 14:06:53 2021
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
     Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
  Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
    Data Offset : 264192 sectors
     New Offset : 261120 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : b4660f49:867b9f1e:ecad0ace:c7119c37

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
  Delta Devices : 1 (4->5)

    Update Time : Thu May  4 14:39:03 2023
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : a4927b98 - correct
         Events : 78714

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
replacing)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
           Name : GRANDSLAM:480
  Creation Time : Tue Oct 26 14:06:53 2021
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
     Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
  Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
    Data Offset : 264192 sectors
     New Offset : 261120 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 79a3dff4:c53f9071:f9c1c262:403fbc10

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
  Delta Devices : 1 (4->5)

    Update Time : Thu May  4 14:38:38 2023
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 112fbe09 - correct
         Events : 78712

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAA. ('A' == active, '.' == missing, 'R' ==
replacing)
/dev/sdf1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x45
     Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
           Name : GRANDSLAM:480
  Creation Time : Tue Oct 26 14:06:53 2021
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 31251492926 (14901.87 GiB 16000.76 GB)
     Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
  Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
    Data Offset : 264192 sectors
     New Offset : 261120 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 9d9c1c0d:030844a7:f365ace6:5e568930

Internal Bitmap : 8 sectors from superblock
  Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
  Delta Devices : 1 (4->5)

    Update Time : Thu May  4 14:39:03 2023
  Bad Block Log : 512 entries available at offset 72 sectors
       Checksum : 2d33aff - correct
         Events : 78714

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
replacing)
---
mdadm -E /dev/sd[c-f]1 | grep -E '^/dev/sd|Update'
/dev/sdc1:
    Update Time : Thu May  4 14:39:03 2023
/dev/sdd1:
    Update Time : Thu May  4 14:39:03 2023
/dev/sde1:
    Update Time : Thu May  4 14:38:38 2023
/dev/sdf1:
    Update Time : Thu May  4 14:39:03 2023
---
mdadm --assemble --scan 
mdadm: /dev/md/GRANDSLAM:480 assembled from 3 drives - not enough to
start the array.
---
/etc/mdadm/mdadm.conf
# This configuration was auto-generated on Tue, 26 Oct 2021 12:52:33
-0500 by mkconf
ARRAY /dev/md480 metadata=1.2 name=GRANDSLAM:480
UUID=20211025:02005a7a:5a7abeef:cafebabe
---

NOTE: Raid Level is shown below as raid0, but this is a RAID5.
      Delta Devices look munged?

NOW;mdadm -D /dev/md480
 2023.05.17 02:44:06 AM 
/dev/md480:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 4
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 4

     Delta Devices : 1, (-1->0)
         New Level : raid5
        New Layout : left-symmetric
     New Chunksize : 512K

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78714

    Number   Major   Minor   RaidDevice

       -       8       81        -        /dev/sdf1
       -       8       65        -        /dev/sde1
       -       8       49        -        /dev/sdd1
       -       8       33        -        /dev/sdc1
---

NOTE: SCT ERC on these TOSHIBA MG08ACA16TE drives defaults to DISABLED.
      I've since enabled the setting, if that helps.
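
(For reference, a sketch of how the setting was enabled; 70 = 7.0
seconds, and on most drives this does not persist across a power cycle:)

for d in /dev/sdc /dev/sdd /dev/sde /dev/sdf; do
    smartctl -l scterc,70,70 "$d"   # set read/write error-recovery timeout to 7.0s
done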

smartctl -l scterc /dev/sdc; smartctl -l scterc /dev/sdd; smartctl -l
scterc /dev/sde; smartctl -l scterc /dev/sdf

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

---

I'm exhausted, and maybe I'm just looking for someone to suggest running
the command that I really don't want to run yet.

Enabling the Loss Of Confusion flag hasn't worked either.



* Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
  2023-05-17 13:26 RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK) raid
@ 2023-05-17 23:45 ` Wol
  2023-05-18  3:15   ` Yu Kuai
  0 siblings, 1 reply; 9+ messages in thread
From: Wol @ 2023-05-17 23:45 UTC (permalink / raw)
  To: raid, linux-raid; +Cc: Yu Kuai, Phil Turmel, NeilBrown

Hmmm. Firstly, what command did you give to grow the array?

Secondly, take a look at the thread "Raid5 to raid6 grow interrupted, 
mdadm hangs on assemble command". There's a problem there with rebuilds 
locking up, which is not fatal, and will be fixed, but might not have 
rippled through yet ...

That raid0 thing is almost certainly nothing to be worried about - it 
seems to be normal for any array that doesn't assemble completely.

The only things that bother me slightly: I believe mdadm 4.2 has been 
released? Don't quote me on that. And scterc is disabled by default? Weird.

I've cc'd a few people who I hope can help further ...

Cheers,
Wol

On 17/05/2023 14:26, raid wrote:
> RAID5 Phantom Drive Appeared while Reshaping Four Drive Array
> (HARDLOCK)
> 
> I've been struggling with this for about two weeks now, realizing that
> I need some expert help.
> 
> My original 18 month old RAID5 consists of three newer TOSHIBA drives.
> /dev/sdc :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
> bytes)
> /dev/sdd :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
> bytes)
> /dev/sde :: TOSHIBA MG08ACA16TE (4002) :: 16 TB (16,000,900,661,248
> bytes)
> 
> Recently added...
> /dev/sdf :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
> bytes)
> 
> In a nutshell, I've added a fourth drive to my RAID5 and executed --
> grow & mdadm estimated completion in 3-5 days.
> At about 30-50% of reshaping, the computer hard locked. Pushing the
> reset button was the agonizing requirement.
> 
> After first reboot mdadm assembled & continued. But it displayed a
> fifth physical disk.
> The phantom FIFTH drive appeared as failed, while the other four
> continued reshaping, temporarily.
> The reshaping speed dropped to 0 after another day or so. It was near
> 80%, I think.
> So, I used mdadm -S then mdadm --assemble --scan it couldn't start
> (because phantom drive?) not enough
> drives to start the array. The Array State on each member shows the
> fifth drive with varying status.
> 
> File system (ext4) appears damaged and won't mount. Unrecognized
> filesystem.
> 20TB are backed up, there are, however, about 7000 newly scanned
> documents that aren't.
> I've done a cursory examination of data using R-Linux. A bit of in-depth
> peeking using Active Disk Editor.
> 
> Life goes on. I've researched and read way more than I ever thought I
> would about mdadm RAID.
> Not any closer on how to proceed. I'm a hardware technician with some
> software skills. I'm stumped.
> Also trying to be cautious not to damage what's left of the RAID. ANY
> help with what commands
> I can attempt to at least get the RAID to assemble WITHOUT the phantom
> fifth drive would be
> immensely appreciated.
> 
> All four drives now appear as spares.
> 
> ---
> watch -c -d -n 1 cat /proc/mdstat
> md480 : inactive sdc1[0](S) sdd1[1](S) sdf1[4](S) sde1[3](S)
>        62502985709 blocks super 1.2
> ---
> uname -a
> Linux OAK2023 4.19.0-24-amd64 #1 SMP Debian 4.19.282-1 (2023-04-29)
> x86_64 GNU/Linux
> ---
> mdadm --version
> mdadm - v4.1 - 2018-10-01
> ---
> mdadm -E /dev/sd[c-f]1
> /dev/sdc1:
>            Magic : a92b4efc
>          Version : 1.2
>      Feature Map : 0x45
>       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
>             Name : GRANDSLAM:480
>    Creation Time : Tue Oct 26 14:06:53 2021
>       Raid Level : raid5
>     Raid Devices : 5
> 
>   Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
>       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
>    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
>      Data Offset : 264192 sectors
>       New Offset : 261120 sectors
>     Super Offset : 8 sectors
>            State : clean
>      Device UUID : 8f0835db:3ea24540:2ab4232d:6203d1b7
> 
> Internal Bitmap : 8 sectors from superblock
>    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
>    Delta Devices : 1 (4->5)
> 
>      Update Time : Thu May  4 14:39:03 2023
>    Bad Block Log : 512 entries available at offset 72 sectors
>         Checksum : 37ac3c04 - correct
>           Events : 78714
> 
>           Layout : left-symmetric
>       Chunk Size : 512K
> 
>     Device Role : Active device 0
>     Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
> replacing)
> /dev/sdd1:
>            Magic : a92b4efc
>          Version : 1.2
>      Feature Map : 0x45
>       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
>             Name : GRANDSLAM:480
>    Creation Time : Tue Oct 26 14:06:53 2021
>       Raid Level : raid5
>     Raid Devices : 5
> 
>   Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
>       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
>    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
>      Data Offset : 264192 sectors
>       New Offset : 261120 sectors
>     Super Offset : 8 sectors
>            State : clean
>      Device UUID : b4660f49:867b9f1e:ecad0ace:c7119c37
> 
> Internal Bitmap : 8 sectors from superblock
>    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
>    Delta Devices : 1 (4->5)
> 
>      Update Time : Thu May  4 14:39:03 2023
>    Bad Block Log : 512 entries available at offset 72 sectors
>         Checksum : a4927b98 - correct
>           Events : 78714
> 
>           Layout : left-symmetric
>       Chunk Size : 512K
> 
>     Device Role : Active device 1
>     Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
> replacing)
> /dev/sde1:
>            Magic : a92b4efc
>          Version : 1.2
>      Feature Map : 0x45
>       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
>             Name : GRANDSLAM:480
>    Creation Time : Tue Oct 26 14:06:53 2021
>       Raid Level : raid5
>     Raid Devices : 5
> 
>   Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
>       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
>    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
>      Data Offset : 264192 sectors
>       New Offset : 261120 sectors
>     Super Offset : 8 sectors
>            State : clean
>      Device UUID : 79a3dff4:c53f9071:f9c1c262:403fbc10
> 
> Internal Bitmap : 8 sectors from superblock
>    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
>    Delta Devices : 1 (4->5)
> 
>      Update Time : Thu May  4 14:38:38 2023
>    Bad Block Log : 512 entries available at offset 72 sectors
>         Checksum : 112fbe09 - correct
>           Events : 78712
> 
>           Layout : left-symmetric
>       Chunk Size : 512K
> 
>     Device Role : Active device 2
>     Array State : AAAA. ('A' == active, '.' == missing, 'R' ==
> replacing)
> /dev/sdf1:
>            Magic : a92b4efc
>          Version : 1.2
>      Feature Map : 0x45
>       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
>             Name : GRANDSLAM:480
>    Creation Time : Tue Oct 26 14:06:53 2021
>       Raid Level : raid5
>     Raid Devices : 5
> 
>   Avail Dev Size : 31251492926 (14901.87 GiB 16000.76 GB)
>       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
>    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
>      Data Offset : 264192 sectors
>       New Offset : 261120 sectors
>     Super Offset : 8 sectors
>            State : clean
>      Device UUID : 9d9c1c0d:030844a7:f365ace6:5e568930
> 
> Internal Bitmap : 8 sectors from superblock
>    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
>    Delta Devices : 1 (4->5)
> 
>      Update Time : Thu May  4 14:39:03 2023
>    Bad Block Log : 512 entries available at offset 72 sectors
>         Checksum : 2d33aff - correct
>           Events : 78714
> 
>           Layout : left-symmetric
>       Chunk Size : 512K
> 
>     Device Role : Active device 3
>     Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
> replacing)
> ---
> mdadm -E /dev/sd[c-f]1 | grep -E '^/dev/sd|Update'
> /dev/sdc1:
>      Update Time : Thu May  4 14:39:03 2023
> /dev/sdd1:
>      Update Time : Thu May  4 14:39:03 2023
> /dev/sde1:
>      Update Time : Thu May  4 14:38:38 2023
> /dev/sdf1:
>      Update Time : Thu May  4 14:39:03 2023
> ---
> mdadm --assemble --scan
> mdadm: /dev/md/GRANDSLAM:480 assembled from 3 drives - not enough to
> start the array.
> ---
> /etc/mdadm/mdadm.conf
> # This configuration was auto-generated on Tue, 26 Oct 2021 12:52:33
> -0500 by mkconf
> ARRAY /dev/md480 metadata=1.2 name=GRANDSLAM:480
> UUID=20211025:02005a7a:5a7abeef:cafebabe
> ---
> 
> NOTE: Raid Level is now shown below to be raid0. This is a RAID5.
>        Delta Devices are munged?
> 
> NOW;mdadm -D /dev/md480
>   2023.05.17 02:44:06 AM
> /dev/md480:
>             Version : 1.2
>          Raid Level : raid0
>       Total Devices : 4
>         Persistence : Superblock is persistent
> 
>               State : inactive
>     Working Devices : 4
> 
>       Delta Devices : 1, (-1->0)
>           New Level : raid5
>          New Layout : left-symmetric
>       New Chunksize : 512K
> 
>                Name : GRANDSLAM:480
>                UUID : 20211025:02005a7a:5a7abeef:cafebabe
>              Events : 78714
> 
>      Number   Major   Minor   RaidDevice
> 
>         -       8       81        -        /dev/sdf1
>         -       8       65        -        /dev/sde1
>         -       8       49        -        /dev/sdd1
>         -       8       33        -        /dev/sdc1
> ---
> 
> NOTE: The HITACHI MG08ACA16TE drives default to DISABLED
>        I've since enabled the setting if this helps.
> 
> smartctl -l scterc /dev/sdc; smartctl -l scterc /dev/sdd; smartctl -l
> scterc /dev/sde; smartctl -l scterc /dev/sdf
> 
> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
> build)
> Copyright (C) 2002-17, Bruce Allen, Christian Franke,
> www.smartmontools.org
> 
> SCT Error Recovery Control:
>             Read:     70 (7.0 seconds)
>            Write:     70 (7.0 seconds)
> 
> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
> build)
> Copyright (C) 2002-17, Bruce Allen, Christian Franke,
> www.smartmontools.org
> 
> SCT Error Recovery Control:
>             Read:     70 (7.0 seconds)
>            Write:     70 (7.0 seconds)
> 
> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
> build)
> Copyright (C) 2002-17, Bruce Allen, Christian Franke,
> www.smartmontools.org
> 
> SCT Error Recovery Control:
>             Read:     70 (7.0 seconds)
>            Write:     70 (7.0 seconds)
> 
> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
> build)
> Copyright (C) 2002-17, Bruce Allen, Christian Franke,
> www.smartmontools.org
> 
> SCT Error Recovery Control:
>             Read:     70 (7.0 seconds)
>            Write:     70 (7.0 seconds)
> 
> ---
> 
> Exhausted and maybe I'm just looking for someone to suggest running the
> command that I really don't want to run yet.
> 
> Enabling Loss Of Confusion flag hasn't worked either.
> 


* Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
  2023-05-17 23:45 ` Wol
@ 2023-05-18  3:15   ` Yu Kuai
  2023-05-22  6:56     ` raid
  2023-05-22  7:20     ` raid
  0 siblings, 2 replies; 9+ messages in thread
From: Yu Kuai @ 2023-05-18  3:15 UTC (permalink / raw)
  To: Wol, raid, linux-raid; +Cc: Yu Kuai, Phil Turmel, NeilBrown, yukuai (C)

Hi,

On 2023/05/18 7:45, Wol wrote:
> Hmmm. Firstly, what command did you give to grow the array?
> 
> Secondly, take a look at the thread "Raid5 to raid6 grow interrupted, 
> mdadm hangs on assemble command". There's a problem there with rebuilds 
> locking up, which is not fatal, and will be fixed, but might not have 
> rippled through yet ...
> 
> That raid0 thing is almost certainly nothing to be worried about - it 
> seems to be normal for any array that doesn't assemble completely.
> 
> The only things that bother me slightly are I believe mdadm 4.2 has been 
> released? Don't quote me on that. And scterc is disabled by default? Weird.
> 
> I've cc'd a few people who I hope can help further ...

Hi, please cc yukuai3@huawei.com for me; the huaweicloud address is only
for sending, I don't receive emails at it...
> 
> Cheers,
> Wol
> 
> On 17/05/2023 14:26, raid wrote:
>> RAID5 Phantom Drive Appeared while Reshaping Four Drive Array
>> (HARDLOCK)
>>
>> I've been struggling with this for about two weeks now, realizing that
>> I need some expert help.
>>
>> My original 18 month old RAID5 consists of three newer TOSHIBA drives.
>> /dev/sdc :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
>> bytes)
>> /dev/sdd :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
>> bytes)
>> /dev/sde :: TOSHIBA MG08ACA16TE (4002) :: 16 TB (16,000,900,661,248
>> bytes)
>>
>> Recently added...
>> /dev/sdf :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
>> bytes)
>>
>> In a nutshell, I've added a fourth drive to my RAID5 and executed --
>> grow & mdadm estimated completion in 3-5 days.
>> At about 30-50% of reshaping, the computer hard locked. Pushing the
>> reset button was the agonizing requirement.
>>
>> After first reboot mdadm assembled & continued. But it displayed a
>> fifth physical disk.
>> The phantom FIFTH drive appeared as failed, while the other four
>> continued reshaping, temporarily.
>> The reshaping speed dropped to 0 after another day or so. It was near
>> 80%, I think.
>> So, I used mdadm -S then mdadm --assemble --scan it couldn't start
>> (because phantom drive?) not enough
>> drives to start the array. The Array State on each member shows the
>> fifth drive with varying status.
>>
>> File system (ext4) appears damaged and won't mount. Unrecognized
>> filesystem.
>> 20TB are backed up, there are, however, about 7000 newly scanned
>> documents that aren't.
>> I've done a cursory examination of data using R-Linux. A bit of in-depth
>> peeking using Active Disk Editor.
>>
>> Life goes on. I've researched and read way more than I ever thought I
>> would about mdadm RAID.
>> Not any closer on how to proceed. I'm a hardware technician with some
>> software skills. I'm stumped.
>> Also trying to be cautious not to damage what's left of the RAID. ANY
>> help with what commands
>> I can attempt to at least get the RAID to assemble WITHOUT the phantom
>> fifth drive would be
>> immensely appreciated.
>>
>> All four drives now appear as spares.
>>
>> ---
>> watch -c -d -n 1 cat /proc/mdstat
>> md480 : inactive sdc1[0](S) sdd1[1](S) sdf1[4](S) sde1[3](S)
>>        62502985709 blocks super 1.2
>> ---
>> uname -a
>> Linux OAK2023 4.19.0-24-amd64 #1 SMP Debian 4.19.282-1 (2023-04-29)
>> x86_64 GNU/Linux
>> ---
>> mdadm --version
>> mdadm - v4.1 - 2018-10-01
>> ---
>> mdadm -E /dev/sd[c-f]1
>> /dev/sdc1:
>>            Magic : a92b4efc
>>          Version : 1.2
>>      Feature Map : 0x45
>>       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
>>             Name : GRANDSLAM:480
>>    Creation Time : Tue Oct 26 14:06:53 2021
>>       Raid Level : raid5
>>     Raid Devices : 5
>>
>>   Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
>>       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
>>    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
>>      Data Offset : 264192 sectors
>>       New Offset : 261120 sectors
>>     Super Offset : 8 sectors
>>            State : clean
>>      Device UUID : 8f0835db:3ea24540:2ab4232d:6203d1b7
>>
>> Internal Bitmap : 8 sectors from superblock
>>    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
>>    Delta Devices : 1 (4->5)
>>
>>      Update Time : Thu May  4 14:39:03 2023
>>    Bad Block Log : 512 entries available at offset 72 sectors
>>         Checksum : 37ac3c04 - correct
>>           Events : 78714
>>
>>           Layout : left-symmetric
>>       Chunk Size : 512K
>>
>>     Device Role : Active device 0
>>     Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
>> replacing)
>> /dev/sdd1:
>>            Magic : a92b4efc
>>          Version : 1.2
>>      Feature Map : 0x45
>>       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
>>             Name : GRANDSLAM:480
>>    Creation Time : Tue Oct 26 14:06:53 2021
>>       Raid Level : raid5
>>     Raid Devices : 5
>>
>>   Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
>>       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
>>    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
>>      Data Offset : 264192 sectors
>>       New Offset : 261120 sectors
>>     Super Offset : 8 sectors
>>            State : clean
>>      Device UUID : b4660f49:867b9f1e:ecad0ace:c7119c37
>>
>> Internal Bitmap : 8 sectors from superblock
>>    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
>>    Delta Devices : 1 (4->5)
>>
>>      Update Time : Thu May  4 14:39:03 2023
>>    Bad Block Log : 512 entries available at offset 72 sectors
>>         Checksum : a4927b98 - correct
>>           Events : 78714
>>
>>           Layout : left-symmetric
>>       Chunk Size : 512K
>>
>>     Device Role : Active device 1
>>     Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
>> replacing)
>> /dev/sde1:
>>            Magic : a92b4efc
>>          Version : 1.2
>>      Feature Map : 0x45
>>       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
>>             Name : GRANDSLAM:480
>>    Creation Time : Tue Oct 26 14:06:53 2021
>>       Raid Level : raid5
>>     Raid Devices : 5
>>
>>   Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
>>       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
>>    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
>>      Data Offset : 264192 sectors
>>       New Offset : 261120 sectors
>>     Super Offset : 8 sectors
>>            State : clean
>>      Device UUID : 79a3dff4:c53f9071:f9c1c262:403fbc10
>>
>> Internal Bitmap : 8 sectors from superblock
>>    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
>>    Delta Devices : 1 (4->5)
>>
>>      Update Time : Thu May  4 14:38:38 2023
>>    Bad Block Log : 512 entries available at offset 72 sectors
>>         Checksum : 112fbe09 - correct
>>           Events : 78712
>>
>>           Layout : left-symmetric
>>       Chunk Size : 512K
>>
>>     Device Role : Active device 2
>>     Array State : AAAA. ('A' == active, '.' == missing, 'R' ==
>> replacing)

I have no idea why the other disks show that device 2 is missing, or
what device 4 is.

Anyway, can you try the following?

mdadm -I /dev/sdc1
mdadm -D /dev/mdxxx

mdadm -I /dev/sdd1
mdadm -D /dev/mdxxx

mdadm -I /dev/sde1
mdadm -D /dev/mdxxx

mdadm -I /dev/sdf1
mdadm -D /dev/mdxxx

If the above works well, you can try:

mdadm -R /dev/mdxxx, and see if the array can be started.
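
(With the device name from your mdadm.conf, a minimal sketch of the
whole sequence:)

mdadm -I /dev/sdc1      # incrementally add each member
mdadm -I /dev/sdd1
mdadm -I /dev/sde1
mdadm -I /dev/sdf1
mdadm -D /dev/md480     # check which members were accepted
mdadm -R /dev/md480     # then try to run the partially assembled array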

Thanks,
Kuai
>> /dev/sdf1:
>>            Magic : a92b4efc
>>          Version : 1.2
>>      Feature Map : 0x45
>>       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
>>             Name : GRANDSLAM:480
>>    Creation Time : Tue Oct 26 14:06:53 2021
>>       Raid Level : raid5
>>     Raid Devices : 5
>>
>>   Avail Dev Size : 31251492926 (14901.87 GiB 16000.76 GB)
>>       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
>>    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
>>      Data Offset : 264192 sectors
>>       New Offset : 261120 sectors
>>     Super Offset : 8 sectors
>>            State : clean
>>      Device UUID : 9d9c1c0d:030844a7:f365ace6:5e568930
>>
>> Internal Bitmap : 8 sectors from superblock
>>    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
>>    Delta Devices : 1 (4->5)
>>
>>      Update Time : Thu May  4 14:39:03 2023
>>    Bad Block Log : 512 entries available at offset 72 sectors
>>         Checksum : 2d33aff - correct
>>           Events : 78714
>>
>>           Layout : left-symmetric
>>       Chunk Size : 512K
>>
>>     Device Role : Active device 3
>>     Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
>> replacing)
>> ---
>> mdadm -E /dev/sd[c-f]1 | grep -E '^/dev/sd|Update'
>> /dev/sdc1:
>>      Update Time : Thu May  4 14:39:03 2023
>> /dev/sdd1:
>>      Update Time : Thu May  4 14:39:03 2023
>> /dev/sde1:
>>      Update Time : Thu May  4 14:38:38 2023
>> /dev/sdf1:
>>      Update Time : Thu May  4 14:39:03 2023
>> ---
>> mdadm --assemble --scan
>> mdadm: /dev/md/GRANDSLAM:480 assembled from 3 drives - not enough to
>> start the array.
>> ---
>> /etc/mdadm/mdadm.conf
>> # This configuration was auto-generated on Tue, 26 Oct 2021 12:52:33
>> -0500 by mkconf
>> ARRAY /dev/md480 metadata=1.2 name=GRANDSLAM:480
>> UUID=20211025:02005a7a:5a7abeef:cafebabe
>> ---
>>
>> NOTE: Raid Level is now shown below to be raid0. This is a RAID5.
>>        Delta Devices are munged?
>>
>> NOW;mdadm -D /dev/md480
>>   2023.05.17 02:44:06 AM
>> /dev/md480:
>>             Version : 1.2
>>          Raid Level : raid0
>>       Total Devices : 4
>>         Persistence : Superblock is persistent
>>
>>               State : inactive
>>     Working Devices : 4
>>
>>       Delta Devices : 1, (-1->0)
>>           New Level : raid5
>>          New Layout : left-symmetric
>>       New Chunksize : 512K
>>
>>                Name : GRANDSLAM:480
>>                UUID : 20211025:02005a7a:5a7abeef:cafebabe
>>              Events : 78714
>>
>>      Number   Major   Minor   RaidDevice
>>
>>         -       8       81        -        /dev/sdf1
>>         -       8       65        -        /dev/sde1
>>         -       8       49        -        /dev/sdd1
>>         -       8       33        -        /dev/sdc1
>> ---
>>
>> NOTE: The HITACHI MG08ACA16TE drives default to DISABLED
>>        I've since enabled the setting if this helps.
>>
>> smartctl -l scterc /dev/sdc; smartctl -l scterc /dev/sdd; smartctl -l
>> scterc /dev/sde; smartctl -l scterc /dev/sdf
>>
>> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
>> build)
>> Copyright (C) 2002-17, Bruce Allen, Christian Franke,
>> www.smartmontools.org
>>
>> SCT Error Recovery Control:
>>             Read:     70 (7.0 seconds)
>>            Write:     70 (7.0 seconds)
>>
>> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
>> build)
>> Copyright (C) 2002-17, Bruce Allen, Christian Franke,
>> www.smartmontools.org
>>
>> SCT Error Recovery Control:
>>             Read:     70 (7.0 seconds)
>>            Write:     70 (7.0 seconds)
>>
>> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
>> build)
>> Copyright (C) 2002-17, Bruce Allen, Christian Franke,
>> www.smartmontools.org
>>
>> SCT Error Recovery Control:
>>             Read:     70 (7.0 seconds)
>>            Write:     70 (7.0 seconds)
>>
>> smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
>> build)
>> Copyright (C) 2002-17, Bruce Allen, Christian Franke,
>> www.smartmontools.org
>>
>> SCT Error Recovery Control:
>>             Read:     70 (7.0 seconds)
>>            Write:     70 (7.0 seconds)
>>
>> ---
>>
>> Exhausted and maybe I'm just looking for someone to suggest running the
>> command that I really don't want to run yet.
>>
>> Enabling Loss Of Confusion flag hasn't worked either.
>>
> 
> .
> 



* Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
  2023-05-18  3:15   ` Yu Kuai
@ 2023-05-22  6:56     ` raid
  2023-05-22  7:51       ` Yu Kuai
  2023-05-22  7:20     ` raid
  1 sibling, 1 reply; 9+ messages in thread
From: raid @ 2023-05-22  6:56 UTC (permalink / raw)
  To: Yu Kuai, Wol, linux-raid; +Cc: Phil Turmel, NeilBrown, yukuai (C)

Hi,
Thanks for the guidance; the current state has at least changed somewhat.

BTW, sorry about life getting in the way of tech. =) That's the reason for my delayed response.

-sudo mdadm -I /dev/sdc1
mdadm: /dev/sdc1 attached to /dev/md480, not enough to start (1).
-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 1
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 1

     Delta Devices : 1, (-1->0)
         New Level : raid5
        New Layout : left-symmetric
     New Chunksize : 512K

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78714

    Number   Major   Minor   RaidDevice

       -       8       33        -        /dev/sdc1
-sudo mdadm -I /dev/sdd1
mdadm: /dev/sdd1 attached to /dev/md480, not enough to start (2).
-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 2
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 2

     Delta Devices : 1, (-1->0)
         New Level : raid5
        New Layout : left-symmetric
     New Chunksize : 512K

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78714

    Number   Major   Minor   RaidDevice

       -       8       49        -        /dev/sdd1
       -       8       33        -        /dev/sdc1
-sudo mdadm -I /dev/sde1
mdadm: /dev/sde1 attached to /dev/md480, not enough to start (2).
-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 3
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 3

     Delta Devices : 1, (-1->0)
         New Level : raid5
        New Layout : left-symmetric
     New Chunksize : 512K

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78712

    Number   Major   Minor   RaidDevice

       -       8       65        -        /dev/sde1
       -       8       49        -        /dev/sdd1
       -       8       33        -        /dev/sdc1
-sudo mdadm -I /dev/sdf1
mdadm: /dev/sdf1 attached to /dev/md480, not enough to start (3).
-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 4
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 4

     Delta Devices : 1, (-1->0)
         New Level : raid5
        New Layout : left-symmetric
     New Chunksize : 512K

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78714

    Number   Major   Minor   RaidDevice

       -       8       81        -        /dev/sdf1
       -       8       65        -        /dev/sde1
       -       8       49        -        /dev/sdd1
       -       8       33        -        /dev/sdc1
-sudo mdadm -R /dev/md480 
mdadm: failed to start array /dev/md480: Input/output error
---
NOTE: Of additional interest...
---
-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
     Creation Time : Tue Oct 26 14:06:53 2021
        Raid Level : raid5
     Used Dev Size : 18446744073709551615
      Raid Devices : 5
     Total Devices : 3
       Persistence : Superblock is persistent

       Update Time : Thu May  4 14:39:03 2023
             State : active, FAILED, Not Started 
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : unknown

     Delta Devices : 1, (4->5)

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78714

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       -       0        0        1      removed
       -       0        0        2      removed
       -       0        0        3      removed
       -       0        0        4      removed

       -       8       81        3      sync   /dev/sdf1
       -       8       49        1      sync   /dev/sdd1
       -       8       33        0      sync   /dev/sdc1

---
-watch -c -d -n 1 cat /proc/mdstat
---
Every 1.0s: cat /proc/mdstat                                                     OAK2023: Mon May 22 01:48:24 2023

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md480 : inactive sdf1[4] sdd1[1] sdc1[0]
      46877239294 blocks super 1.2

unused devices: <none>
---
Hopefully that is some progress towards an array start? It's definitely unexpected output to me.
I/O error starting md480.

Thanks!
SA

On Thu, 2023-05-18 at 11:15 +0800, Yu Kuai wrote:

> I have no idea why the other disks show that device 2 is missing, or
> what device 4 is.
> 
> Anyway, can you try the following?
> 
> mdadm -I /dev/sdc1
> mdadm -D /dev/mdxxx
> 
> mdadm -I /dev/sdd1
> mdadm -D /dev/mdxxx
> 
> mdadm -I /dev/sde1
> mdadm -D /dev/mdxxx
> 
> mdadm -I /dev/sdf1
> mdadm -D /dev/mdxxx
> 
> If above works well, you can try:
> 
> mdadm -R /dev/mdxxx, and see if the array can be started.
> 
> Thanks,
> Kuai





* Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
  2023-05-18  3:15   ` Yu Kuai
  2023-05-22  6:56     ` raid
@ 2023-05-22  7:20     ` raid
  1 sibling, 0 replies; 9+ messages in thread
From: raid @ 2023-05-22  7:20 UTC (permalink / raw)
  To: Yu Kuai, Wol, linux-raid; +Cc: Phil Turmel, NeilBrown, yukuai (C)

Hi,
Thanks for the guidance; the current state has at least changed somewhat.

BTW, sorry about life getting in the way of tech. =) That's the reason for my delayed response.

-sudo mdadm -I /dev/sdc1
mdadm: /dev/sdc1 attached to /dev/md480, not enough to start (1).

-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 1
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 1

     Delta Devices : 1, (-1->0)
         New Level : raid5
        New Layout : left-symmetric
     New Chunksize : 512K

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78714

    Number   Major   Minor   RaidDevice

       -       8       33        -        /dev/sdc1

-sudo mdadm -I /dev/sdd1
mdadm: /dev/sdd1 attached to /dev/md480, not enough to start (2).

-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 2
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 2

     Delta Devices : 1, (-1->0)
         New Level : raid5
        New Layout : left-symmetric
     New Chunksize : 512K

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78714

    Number   Major   Minor   RaidDevice

       -       8       49        -        /dev/sdd1
       -       8       33        -        /dev/sdc1

-sudo mdadm -I /dev/sde1
mdadm: /dev/sde1 attached to /dev/md480, not enough to start (2).

-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 3
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 3

     Delta Devices : 1, (-1->0)
         New Level : raid5
        New Layout : left-symmetric
     New Chunksize : 512K

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78712

    Number   Major   Minor   RaidDevice

       -       8       65        -        /dev/sde1
       -       8       49        -        /dev/sdd1
       -       8       33        -        /dev/sdc1

-sudo mdadm -I /dev/sdf1
mdadm: /dev/sdf1 attached to /dev/md480, not enough to start (3).

-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 4
       Persistence : Superblock is persistent

             State : inactive
   Working Devices : 4

     Delta Devices : 1, (-1->0)
         New Level : raid5
        New Layout : left-symmetric
     New Chunksize : 512K

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78714

    Number   Major   Minor   RaidDevice

       -       8       81        -        /dev/sdf1
       -       8       65        -        /dev/sde1
       -       8       49        -        /dev/sdd1
       -       8       33        -        /dev/sdc1

-sudo mdadm -R /dev/md480 
mdadm: failed to start array /dev/md480: Input/output error

---
NOTE: Of additional interest...
---

-sudo mdadm -D /dev/md480 
/dev/md480:
           Version : 1.2
     Creation Time : Tue Oct 26 14:06:53 2021
        Raid Level : raid5
     Used Dev Size : 18446744073709551615
      Raid Devices : 5
     Total Devices : 3
       Persistence : Superblock is persistent

       Update Time : Thu May  4 14:39:03 2023
             State : active, FAILED, Not Started 
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 512K

Consistency Policy : unknown

     Delta Devices : 1, (4->5)

              Name : GRANDSLAM:480
              UUID : 20211025:02005a7a:5a7abeef:cafebabe
            Events : 78714

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       -       0        0        1      removed
       -       0        0        2      removed
       -       0        0        3      removed
       -       0        0        4      removed

       -       8       81        3      sync   /dev/sdf1
       -       8       49        1      sync   /dev/sdd1
       -       8       33        0      sync   /dev/sdc1

---
-watch -c -d -n 1 cat /proc/mdstat
---
Every 1.0s: cat /proc/mdstat                                                     OAK2023: Mon May 22 01:48:24 2023

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md480 : inactive sdf1[4] sdd1[1] sdc1[0]
      46877239294 blocks super 1.2

unused devices: <none>
---

Hopefully that is some progress towards an array start? It's definitely unexpected output to me.
I/O error starting md480.

Thanks!
SA

On Thu, 2023-05-18 at 11:15 +0800, Yu Kuai wrote:
> Hi,
> 
> On 2023/05/18 7:45, Wol wrote:
> > Hmmm. Firstly, what command did you give to grow the array?
> > 
> > Secondly, take a look at the thread "Raid5 to raid6 grow interrupted, 
> > mdadm hangs on assemble command". There's a problem there with rebuilds 
> > locking up, which is not fatal, and will be fixed, but might not have 
> > rippled through yet ...
> > 
> > That raid0 thing is almost certainly nothing to be worried about - it 
> > seems to be normal for any array that doesn't assemble completely.
> > 
> > The only things that bother me slightly are I believe mdadm 4.2 has been 
> > released? Don't quote me on that. And scterc is disabled by default? Weird.
> > 
> > I've cc'd a few people who I hope can help further ...
> 
> Hi, please cc yukuai3@huawei.com for me; the huaweicloud address is only
> for sending, I don't receive emails at it...
> > Cheers,
> > Wol
> > 
> > On 17/05/2023 14:26, raid wrote:
> > > RAID5 Phantom Drive Appeared while Reshaping Four Drive Array
> > > (HARDLOCK)
> > > 
> > > I've been struggling with this for about two weeks now, realizing that
> > > I need some expert help.
> > > 
> > > My original 18 month old RAID5 consists of three newer TOSHIBA drives.
> > > /dev/sdc :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
> > > bytes)
> > > /dev/sdd :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
> > > bytes)
> > > /dev/sde :: TOSHIBA MG08ACA16TE (4002) :: 16 TB (16,000,900,661,248
> > > bytes)
> > > 
> > > Recently added...
> > > /dev/sdf :: TOSHIBA MG08ACA16TE (4303) :: 16 TB (16,000,900,661,248
> > > bytes)
> > > 
> > > In a nutshell, I've added a fourth drive to my RAID5 and executed --
> > > grow & mdadm estimated completion in 3-5 days.
> > > At about 30-50% of reshaping, the computer hard locked. Pushing the
> > > reset button was the agonizing requirement.
> > > 
> > > After first reboot mdadm assembled & continued. But it displayed a
> > > fifth physical disk.
> > > The phantom FIFTH drive appeared as failed, while the other four
> > > continued reshaping, temporarily.
> > > The reshaping speed dropped to 0 after another day or so. It was near
> > > 80%, I think.
> > > So, I used mdadm -S then mdadm --assemble --scan it couldn't start
> > > (because phantom drive?) not enough
> > > drives to start the array. The Array State on each member shows the
> > > fifth drive with varying status.
> > > 
> > > File system (ext4) appears damaged and won't mount. Unrecognized
> > > filesystem.
> > > 20TB are backed up, there are, however, about 7000 newly scanned
> > > documents that aren't.
> > > I've done a cursory examination of data using R-Linux. A bit of in-depth
> > > peeking using Active Disk Editor.
> > > 
> > > Life goes on. I've researched and read way more than I ever thought I
> > > would about mdadm RAID.
> > > Not any closer on how to proceed. I'm a hardware technician with some
> > > software skills. I'm stumped.
> > > Also trying to be cautious not to damage what's left of the RAID. ANY
> > > help with what commands
> > > I can attempt to at least get the RAID to assemble WITHOUT the phantom
> > > fifth drive would be
> > > immensely appreciated.
> > > 
> > > All four drives now appear as spares.
> > > 
> > > ---
> > > watch -c -d -n 1 cat /proc/mdstat
> > > md480 : inactive sdc1[0](S) sdd1[1](S) sdf1[4](S) sde1[3](S)
> > >        62502985709 blocks super 1.2
> > > ---
> > > uname -a
> > > Linux OAK2023 4.19.0-24-amd64 #1 SMP Debian 4.19.282-1 (2023-04-29)
> > > x86_64 GNU/Linux
> > > ---
> > > mdadm --version
> > > mdadm - v4.1 - 2018-10-01
> > > ---
> > > mdadm -E /dev/sd[c-f]1
> > > /dev/sdc1:
> > >            Magic : a92b4efc
> > >          Version : 1.2
> > >      Feature Map : 0x45
> > >       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >             Name : GRANDSLAM:480
> > >    Creation Time : Tue Oct 26 14:06:53 2021
> > >       Raid Level : raid5
> > >     Raid Devices : 5
> > > 
> > >   Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
> > >       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
> > >    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
> > >      Data Offset : 264192 sectors
> > >       New Offset : 261120 sectors
> > >     Super Offset : 8 sectors
> > >            State : clean
> > >      Device UUID : 8f0835db:3ea24540:2ab4232d:6203d1b7
> > > 
> > > Internal Bitmap : 8 sectors from superblock
> > >    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
> > >    Delta Devices : 1 (4->5)
> > > 
> > >      Update Time : Thu May  4 14:39:03 2023
> > >    Bad Block Log : 512 entries available at offset 72 sectors
> > >         Checksum : 37ac3c04 - correct
> > >           Events : 78714
> > > 
> > >           Layout : left-symmetric
> > >       Chunk Size : 512K
> > > 
> > >     Device Role : Active device 0
> > >     Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
> > > replacing)
> > > /dev/sdd1:
> > >            Magic : a92b4efc
> > >          Version : 1.2
> > >      Feature Map : 0x45
> > >       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >             Name : GRANDSLAM:480
> > >    Creation Time : Tue Oct 26 14:06:53 2021
> > >       Raid Level : raid5
> > >     Raid Devices : 5
> > > 
> > >   Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
> > >       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
> > >    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
> > >      Data Offset : 264192 sectors
> > >       New Offset : 261120 sectors
> > >     Super Offset : 8 sectors
> > >            State : clean
> > >      Device UUID : b4660f49:867b9f1e:ecad0ace:c7119c37
> > > 
> > > Internal Bitmap : 8 sectors from superblock
> > >    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
> > >    Delta Devices : 1 (4->5)
> > > 
> > >      Update Time : Thu May  4 14:39:03 2023
> > >    Bad Block Log : 512 entries available at offset 72 sectors
> > >         Checksum : a4927b98 - correct
> > >           Events : 78714
> > > 
> > >           Layout : left-symmetric
> > >       Chunk Size : 512K
> > > 
> > >     Device Role : Active device 1
> > >     Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
> > > replacing)
> > > /dev/sde1:
> > >            Magic : a92b4efc
> > >          Version : 1.2
> > >      Feature Map : 0x45
> > >       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >             Name : GRANDSLAM:480
> > >    Creation Time : Tue Oct 26 14:06:53 2021
> > >       Raid Level : raid5
> > >     Raid Devices : 5
> > > 
> > >   Avail Dev Size : 31251492831 (14901.87 GiB 16000.76 GB)
> > >       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
> > >    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
> > >      Data Offset : 264192 sectors
> > >       New Offset : 261120 sectors
> > >     Super Offset : 8 sectors
> > >            State : clean
> > >      Device UUID : 79a3dff4:c53f9071:f9c1c262:403fbc10
> > > 
> > > Internal Bitmap : 8 sectors from superblock
> > >    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
> > >    Delta Devices : 1 (4->5)
> > > 
> > >      Update Time : Thu May  4 14:38:38 2023
> > >    Bad Block Log : 512 entries available at offset 72 sectors
> > >         Checksum : 112fbe09 - correct
> > >           Events : 78712
> > > 
> > >           Layout : left-symmetric
> > >       Chunk Size : 512K
> > > 
> > >     Device Role : Active device 2
> > >     Array State : AAAA. ('A' == active, '.' == missing, 'R' ==
> > > replacing)
> 
> I have no idea why the other disks show that device 2 is missing, or
> what device 4 is.
> 
> Anyway, can you try the following?
> 
> mdadm -I /dev/sdc1
> mdadm -D /dev/mdxxx
> 
> mdadm -I /dev/sdd1
> mdadm -D /dev/mdxxx
> 
> mdadm -I /dev/sde1
> mdadm -D /dev/mdxxx
> 
> mdadm -I /dev/sdf1
> mdadm -D /dev/mdxxx
> 
> If above works well, you can try:
> 
> mdadm -R /dev/mdxxx, and see if the array can be started.
> 
> Thanks,
> Kuai
> > > /dev/sdf1:
> > >            Magic : a92b4efc
> > >          Version : 1.2
> > >      Feature Map : 0x45
> > >       Array UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >             Name : GRANDSLAM:480
> > >    Creation Time : Tue Oct 26 14:06:53 2021
> > >       Raid Level : raid5
> > >     Raid Devices : 5
> > > 
> > >   Avail Dev Size : 31251492926 (14901.87 GiB 16000.76 GB)
> > >       Array Size : 62502983680 (59607.49 GiB 64003.06 GB)
> > >    Used Dev Size : 31251491840 (14901.87 GiB 16000.76 GB)
> > >      Data Offset : 264192 sectors
> > >       New Offset : 261120 sectors
> > >     Super Offset : 8 sectors
> > >            State : clean
> > >      Device UUID : 9d9c1c0d:030844a7:f365ace6:5e568930
> > > 
> > > Internal Bitmap : 8 sectors from superblock
> > >    Reshape pos'n : 51850891264 (49448.86 GiB 53095.31 GB)
> > >    Delta Devices : 1 (4->5)
> > > 
> > >      Update Time : Thu May  4 14:39:03 2023
> > >    Bad Block Log : 512 entries available at offset 72 sectors
> > >         Checksum : 2d33aff - correct
> > >           Events : 78714
> > > 
> > >           Layout : left-symmetric
> > >       Chunk Size : 512K
> > > 
> > >     Device Role : Active device 3
> > >     Array State : AA.A. ('A' == active, '.' == missing, 'R' ==
> > > replacing)
> > > ---
> > > mdadm -E /dev/sd[c-f]1 | grep -E '^/dev/sd|Update'
> > > /dev/sdc1:
> > >      Update Time : Thu May  4 14:39:03 2023
> > > /dev/sdd1:
> > >      Update Time : Thu May  4 14:39:03 2023
> > > /dev/sde1:
> > >      Update Time : Thu May  4 14:38:38 2023
> > > /dev/sdf1:
> > >      Update Time : Thu May  4 14:39:03 2023
> > > ---
> > > mdadm --assemble --scan
> > > mdadm: /dev/md/GRANDSLAM:480 assembled from 3 drives - not enough to
> > > start the array.
> > > ---
> > > /etc/mdadm/mdadm.conf
> > > # This configuration was auto-generated on Tue, 26 Oct 2021 12:52:33
> > > -0500 by mkconf
> > > ARRAY /dev/md480 metadata=1.2 name=GRANDSLAM:480
> > > UUID=20211025:02005a7a:5a7abeef:cafebabe
> > > ---
> > > 
> > > NOTE: Raid Level is now shown below to be raid0. This is a RAID5.
> > >        Delta Devices are munged?
> > > 
> > > NOW;mdadm -D /dev/md480
> > >   2023.05.17 02:44:06 AM
> > > /dev/md480:
> > >             Version : 1.2
> > >          Raid Level : raid0
> > >       Total Devices : 4
> > >         Persistence : Superblock is persistent
> > > 
> > >               State : inactive
> > >     Working Devices : 4
> > > 
> > >       Delta Devices : 1, (-1->0)
> > >           New Level : raid5
> > >          New Layout : left-symmetric
> > >       New Chunksize : 512K
> > > 
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78714
> > > 
> > >      Number   Major   Minor   RaidDevice
> > > 
> > >         -       8       81        -        /dev/sdf1
> > >         -       8       65        -        /dev/sde1
> > >         -       8       49        -        /dev/sdd1
> > >         -       8       33        -        /dev/sdc1
> > > ---
> > > 
> > > NOTE: The HITACHI MG08ACA16TE drives default to DISABLED
> > >        I've since enabled the setting if this helps.
> > > 
> > > smartctl -l scterc /dev/sdc; smartctl -l scterc /dev/sdd; smartctl -l
> > > scterc /dev/sde; smartctl -l scterc /dev/sdf
> > > 
> > > smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
> > > build)
> > > Copyright (C) 2002-17, Bruce Allen, Christian Franke,
> > > www.smartmontools.org
> > > 
> > > SCT Error Recovery Control:
> > >             Read:     70 (7.0 seconds)
> > >            Write:     70 (7.0 seconds)
> > > 
> > > smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
> > > build)
> > > Copyright (C) 2002-17, Bruce Allen, Christian Franke,
> > > www.smartmontools.org
> > > 
> > > SCT Error Recovery Control:
> > >             Read:     70 (7.0 seconds)
> > >            Write:     70 (7.0 seconds)
> > > 
> > > smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
> > > build)
> > > Copyright (C) 2002-17, Bruce Allen, Christian Franke,
> > > www.smartmontools.org
> > > 
> > > SCT Error Recovery Control:
> > >             Read:     70 (7.0 seconds)
> > >            Write:     70 (7.0 seconds)
> > > 
> > > smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-24-amd64] (local
> > > build)
> > > Copyright (C) 2002-17, Bruce Allen, Christian Franke,
> > > www.smartmontools.org
> > > 
> > > SCT Error Recovery Control:
> > >             Read:     70 (7.0 seconds)
> > >            Write:     70 (7.0 seconds)
> > > 
> > > ---
> > > 
> > > Exhausted and maybe I'm just looking for someone to suggest running the
> > > command that I really don't want to run yet.
> > > 
> > > Enabling Loss Of Confusion flag hasn't worked either.
> > > 
> > 
> > .
> > 



* Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
  2023-05-22  6:56     ` raid
@ 2023-05-22  7:51       ` Yu Kuai
  2023-05-22 19:50         ` raid
  0 siblings, 1 reply; 9+ messages in thread
From: Yu Kuai @ 2023-05-22  7:51 UTC (permalink / raw)
  To: raid, Yu Kuai, Wol, linux-raid; +Cc: Phil Turmel, NeilBrown, yukuai (C)

Hi,

On 2023/05/22 14:56, raid wrote:
> Hi,
> Thanks for the guidance as the current state has at least changed somewhat.
> 
> BTW Sorry about Life getting in the way of tech. =) Reason for my delayed response.
> 
> -sudo mdadm -I /dev/sdc1
> mdadm: /dev/sdc1 attached to /dev/md480, not enough to start (1).
> -sudo mdadm -D /dev/md480
> /dev/md480:
>             Version : 1.2
>          Raid Level : raid0
>       Total Devices : 1
>         Persistence : Superblock is persistent
> 
>               State : inactive
>     Working Devices : 1
> 
>       Delta Devices : 1, (-1->0)
>           New Level : raid5
>          New Layout : left-symmetric
>       New Chunksize : 512K
> 
>                Name : GRANDSLAM:480
>                UUID : 20211025:02005a7a:5a7abeef:cafebabe
>              Events : 78714
> 
>      Number   Major   Minor   RaidDevice
> 
>         -       8       33        -        /dev/sdc1
> -sudo mdadm -I /dev/sdd1
> mdadm: /dev/sdd1 attached to /dev/md480, not enough to start (2).
> -sudo mdadm -D /dev/md480
> /dev/md480:
>             Version : 1.2
>          Raid Level : raid0
>       Total Devices : 2
>         Persistence : Superblock is persistent
> 
>               State : inactive
>     Working Devices : 2
> 
>       Delta Devices : 1, (-1->0)
>           New Level : raid5
>          New Layout : left-symmetric
>       New Chunksize : 512K
> 
>                Name : GRANDSLAM:480
>                UUID : 20211025:02005a7a:5a7abeef:cafebabe
>              Events : 78714
> 
>      Number   Major   Minor   RaidDevice
> 
>         -       8       49        -        /dev/sdd1
>         -       8       33        -        /dev/sdc1
> -sudo mdadm -I /dev/sde1
> mdadm: /dev/sde1 attached to /dev/md480, not enough to start (2).
> -sudo mdadm -D /dev/md480
> /dev/md480:
>             Version : 1.2
>          Raid Level : raid0
>       Total Devices : 3
>         Persistence : Superblock is persistent
> 
>               State : inactive
>     Working Devices : 3
> 
>       Delta Devices : 1, (-1->0)
>           New Level : raid5
>          New Layout : left-symmetric
>       New Chunksize : 512K
> 
>                Name : GRANDSLAM:480
>                UUID : 20211025:02005a7a:5a7abeef:cafebabe
>              Events : 78712
> 
>      Number   Major   Minor   RaidDevice
> 
>         -       8       65        -        /dev/sde1
>         -       8       49        -        /dev/sdd1
>         -       8       33        -        /dev/sdc1
> -sudo mdadm -I /dev/sdf1
> mdadm: /dev/sdf1 attached to /dev/md480, not enough to start (3).
> -sudo mdadm -D /dev/md480
> /dev/md480:
>             Version : 1.2
>          Raid Level : raid0
>       Total Devices : 4
>         Persistence : Superblock is persistent
> 
>               State : inactive
>     Working Devices : 4
> 
>       Delta Devices : 1, (-1->0)
>           New Level : raid5
>          New Layout : left-symmetric
>       New Chunksize : 512K
> 
>                Name : GRANDSLAM:480
>                UUID : 20211025:02005a7a:5a7abeef:cafebabe
>              Events : 78714
> 
>      Number   Major   Minor   RaidDevice
> 
>         -       8       81        -        /dev/sdf1
>         -       8       65        -        /dev/sde1
>         -       8       49        -        /dev/sdd1
>         -       8       33        -        /dev/sdc1
> -sudo mdadm -R /dev/md480
> mdadm: failed to start array /dev/md480: Input/output error
> ---
> NOTE: Of additional interest...
> ---
> -sudo mdadm -D /dev/md480
> /dev/md480:
>             Version : 1.2
>       Creation Time : Tue Oct 26 14:06:53 2021
>          Raid Level : raid5
>       Used Dev Size : 18446744073709551615
>        Raid Devices : 5
>       Total Devices : 3
>         Persistence : Superblock is persistent
> 
>         Update Time : Thu May  4 14:39:03 2023
>               State : active, FAILED, Not Started
>      Active Devices : 3
>     Working Devices : 3
>      Failed Devices : 0
>       Spare Devices : 0
> 
>              Layout : left-symmetric
>          Chunk Size : 512K
> 
> Consistency Policy : unknown
> 
>       Delta Devices : 1, (4->5)
> 
>                Name : GRANDSLAM:480
>                UUID : 20211025:02005a7a:5a7abeef:cafebabe
>              Events : 78714
> 
>      Number   Major   Minor   RaidDevice State
>         -       0        0        0      removed
>         -       0        0        1      removed
>         -       0        0        2      removed
>         -       0        0        3      removed
>         -       0        0        4      removed
> 
>         -       8       81        3      sync   /dev/sdf1
>         -       8       49        1      sync   /dev/sdd1
>         -       8       33        0      sync   /dev/sdc1

So the reason this array can't start is that /dev/sde1 is not
recognized as RaidDevice 2, which leaves two RaidDevices missing for
a raid5.

Sadly I have no idea how to work around this; the superblock metadata seems to be broken.
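
If you want to double-check, something like the following is what I mean
(illustrative only; it just filters fields that mdadm -E already prints,
adjust device names as needed) to see how each member identifies itself
and whether sde1 really lost its role:

for d in /dev/sd[c-f]1; do
    echo "== $d =="
    # role of this member, its view of the whole array, and its event counter
    mdadm -E "$d" | grep -E 'Device Role|Array State|Events|Update Time'
done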

Thanks,
Kuai
> 
> ---
> -watch -c -d -n 1 cat /proc/mdstat
> ---
> Every 1.0s: cat /proc/mdstat                                                     OAK2023: Mon May 22 01:48:24 2023
> 
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> md480 : inactive sdf1[4] sdd1[1] sdc1[0]
>        46877239294 blocks super 1.2
> 
> unused devices: <none>
> ---
> Hopefully that is some progress towards an array start? It's definitely unexpected output to me.
> I/O Error starting md480
> 
> Thanks!
> SA
> 
> On Thu, 2023-05-18 at 11:15 +0800, Yu Kuai wrote:
> 
>> I have no idle why other disk shows that device 2 is missing, and what
>> is device 4.
>>
>> Anyway, can you try the following?
>>
>> mdadm -I /dev/sdc1
>> mdadm -D /dev/mdxxx
>>
>> mdadm -I /dev/sdd1
>> mdadm -D /dev/mdxxx
>>
>> mdadm -I /dev/sde1
>> mdadm -D /dev/mdxxx
>>
>> mdadm -I /dev/sdf1
>> mdadm -D /dev/mdxxx
>>
>> If above works well, you can try:
>>
>> mdadm -R /dev/mdxxx, and see if the array can be started.
>>
>> Thanks,
>> Kuai
> 
> 
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
  2023-05-22  7:51       ` Yu Kuai
@ 2023-05-22 19:50         ` raid
  2023-05-22 23:50           ` Roger Heflin
  0 siblings, 1 reply; 9+ messages in thread
From: raid @ 2023-05-22 19:50 UTC (permalink / raw)
  To: Yu Kuai, Wol, linux-raid; +Cc: Phil Turmel, NeilBrown, yukuai (C)

Hi

Thanks for your time so far! Final questions before I rebuild this RAID from scratch.

BTW I created detailed notes when I created this array (as I have for eight other RAIDs that I maintain).
    These notes may be applicable later... Here's why.

Do you think that zeroing the drives (as is done for initial drive prep) and then recreating the
RAID5 using the initial settings (originally three drives, NOW four drives) could possibly offer
a greater chance to recover files? As in, more complete file recovery if the striping aligns
correctly? Technically, I've already had to write off the files that aren't currently backed up.

However, I'm still willing to make an attempt if you think the idea above might yield something
better than one or two stripes of data per file?

And/or any other tips for this final attempt? Setting the array read-only, if possible?
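
Something along these lines is what I have in mind for the read-only attempt
(sketch only, device names as above; I'm assuming this mdadm build accepts
--readonly together with --assemble, otherwise the start_ro module parameter
should give the same effect):

# keep newly started arrays read-only until explicitly switched to read-write
echo 1 | sudo tee /sys/module/md_mod/parameters/start_ro

sudo mdadm --stop /dev/md480
sudo mdadm --assemble --readonly /dev/md480 /dev/sd[c-f]1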

Thanks Again
SA

---
Detailed Notes:
============================================================
  2021.10.26 0200P NEW RAID MD480 (48TB) 3x 16TB TOSHIBA
========================================================================================================================
= PREPARATION ==

watch -c -d -n 1 cat /proc/mdstat  ############## OPEN A TERMINAL AND MONITOR STATUS ##

sudo lsblk && sudo blkid  ########################################### VERIFY DEVICES ##

sudo umount /MEGARAID                         # Unmount if filesystem is mounted
sudo mdadm --stop /dev/md480                  # Stop the RAID/md480 device
sudo mdadm --zero-superblock /dev/sd[cdf]1    # Zero  the   superblock(s)  on
                                              #      all members of the array
sudo mdadm --remove /dev/md480                # Remove the RAID/md480

Edit  ########################################## OPTIONAL FINALIZE PERMANENT REMOVAL ##
/etc/fstab 
/etc/mdadm/mdadm.conf
Removing references to the mounting and the definition of the RAID/MD480 device(s)
NOTE: Some fstab CFG settings allow skipping devices when unavailable at boot. (nofail)

sudo update-initramfs -uv       # -uv  update ; verbose  ########### RESET INITRAMFS ##

======================================================================================== CREATE RAID & ADD FILESYSTEM ==
  MEGARAID 2021.10.26 0200P
##############  RAID5 ARRAY MD480 32TB (32,001,527,644,160 bytes) Available (3x16TB) ##

sudo mdadm --create --verbose /dev/md480 --level=5 --raid-devices=3 --uuid=2021102502005a7a5a7abeefcafebabe
/dev/sd[cdf]1

31,251,491,840 BLOCKS CREATED IN ~20 HOURS

############################################################  CREATE FILESYSTEM EXT4 ##
 -v VERBOSE
 -L DISK LABEL
 -U UUID FORMATTED AS 8CHARS-4CHARS-4CHARS-4CHARS-12CHARS
 -m OVERFLOW PROTECTION PERCENTAGE IE. .025 OF 24,576GB IS ~615MB FREE IS CONSIDERED FULL
 -b BLOCK SIZE 1/4 OF STRIDE= OFFERS BEST OVERALL PERFORMANCE
 -E STRIDE= MULTIPLE OF 8
    STRIPE-WIDTH= STRIDE X 2

sudo mkfs.ext4 -v -L MEGARAID    -U 20211028-0500-5a7a-5a7a-beefcafebabe -m .025 -b 4096 -E stride=32,stripe-width=64
/dev/md480
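
For what it's worth, the textbook ext4-on-RAID5 calculation (added here for
reference only, not part of the original notes) would have been:

#  SANITY CHECK (assumes 512K chunk, 4K blocks, 3-disk RAID5 = 2 data disks):
#    stride       = chunk / block        = 512K / 4K = 128
#    stripe-width = stride x data disks  = 128 x 2   = 256
#  (stride=32,stripe-width=64 above is what was actually used.)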

sudo mkdir  /MEGARAID  ; sudo chown adminx:adminx -R /MEGARAID

##############################################################  SET CORRECT HOMEHOST ##

sudo umount /MEGARAID
sudo mdadm --stop /dev/md480
sudo mdadm --assemble --update=homehost --homehost=GRANDSLAM /dev/md480 /dev/sd[cdf]1
sudo blkid

/dev/sdc1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
           UUID_SUB="8f0835db-3ea2-4540-2ab4-232d6203d1b7"
           LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
           PARTLABEL="HIT*16TB*001*RAID5"
           PARTUUID="3b68fe63-35d0-404d-912e-dfe1127f109b"

/dev/sdd1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
           UUID_SUB="b4660f49-867b-9f1e-ecad-0acec7119c37"
           LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
           PARTLABEL="HIT*16TB*002*RAID5"
           PARTUUID="32c50f4f-f6ce-4309-b8e4-facdb6e05ba8"

/dev/sdf1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
           UUID_SUB="79a3dff4-c53f-9071-f9c1-c262403fbc10"
           LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
           PARTLABEL="HIT*16TB*003*RAID5"
           PARTUUID="7ec27f96-2275-4e09-9013-ac056f11ebfb"

/dev/md480: LABEL="MEGARAID" UUID="20211028-0500-5a7a-5a7a-beefcafebabe" TYPE="ext4"

############################################################### ENTRY FOR /ETC/FSTAB ##

/dev/md480		/MEGARAID		ext4		nofail,noatime,nodiratime,relatime,errors=remount-ro		
0		2

#################################################### ENTRY FOR /ETC/MDADM/MDADM.CONF ##

ARRAY /dev/md480 metadata=1.2 name=GRANDSLAM:480 UUID=20211025:02005a7a:5a7abeef:cafebabe

#######################################################################################

sudo update-initramfs -uv       # -uv  update ; verbose
sudo mount -a
sudo chown adminx:adminx -R /MEGARAID

############################################################### END 2021.10.28 0545A ##






On Mon, 2023-05-22 at 15:51 +0800, Yu Kuai wrote:
> Hi,
> 
> On 2023/05/22 14:56, raid wrote:
> > Hi,
> > Thanks for the guidance as the current state has at least changed somewhat.
> > 
> > BTW Sorry about Life getting in the way of tech. =) Reason for my delayed response.
> > 
> > -sudo mdadm -I /dev/sdc1
> > mdadm: /dev/sdc1 attached to /dev/md480, not enough to start (1).
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> >             Version : 1.2
> >          Raid Level : raid0
> >       Total Devices : 1
> >         Persistence : Superblock is persistent
> > 
> >               State : inactive
> >     Working Devices : 1
> > 
> >       Delta Devices : 1, (-1->0)
> >           New Level : raid5
> >          New Layout : left-symmetric
> >       New Chunksize : 512K
> > 
> >                Name : GRANDSLAM:480
> >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> >              Events : 78714
> > 
> >      Number   Major   Minor   RaidDevice
> > 
> >         -       8       33        -        /dev/sdc1
> > -sudo mdadm -I /dev/sdd1
> > mdadm: /dev/sdd1 attached to /dev/md480, not enough to start (2).
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> >             Version : 1.2
> >          Raid Level : raid0
> >       Total Devices : 2
> >         Persistence : Superblock is persistent
> > 
> >               State : inactive
> >     Working Devices : 2
> > 
> >       Delta Devices : 1, (-1->0)
> >           New Level : raid5
> >          New Layout : left-symmetric
> >       New Chunksize : 512K
> > 
> >                Name : GRANDSLAM:480
> >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> >              Events : 78714
> > 
> >      Number   Major   Minor   RaidDevice
> > 
> >         -       8       49        -        /dev/sdd1
> >         -       8       33        -        /dev/sdc1
> > -sudo mdadm -I /dev/sde1
> > mdadm: /dev/sde1 attached to /dev/md480, not enough to start (2).
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> >             Version : 1.2
> >          Raid Level : raid0
> >       Total Devices : 3
> >         Persistence : Superblock is persistent
> > 
> >               State : inactive
> >     Working Devices : 3
> > 
> >       Delta Devices : 1, (-1->0)
> >           New Level : raid5
> >          New Layout : left-symmetric
> >       New Chunksize : 512K
> > 
> >                Name : GRANDSLAM:480
> >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> >              Events : 78712
> > 
> >      Number   Major   Minor   RaidDevice
> > 
> >         -       8       65        -        /dev/sde1
> >         -       8       49        -        /dev/sdd1
> >         -       8       33        -        /dev/sdc1
> > -sudo mdadm -I /dev/sdf1
> > mdadm: /dev/sdf1 attached to /dev/md480, not enough to start (3).
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> >             Version : 1.2
> >          Raid Level : raid0
> >       Total Devices : 4
> >         Persistence : Superblock is persistent
> > 
> >               State : inactive
> >     Working Devices : 4
> > 
> >       Delta Devices : 1, (-1->0)
> >           New Level : raid5
> >          New Layout : left-symmetric
> >       New Chunksize : 512K
> > 
> >                Name : GRANDSLAM:480
> >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> >              Events : 78714
> > 
> >      Number   Major   Minor   RaidDevice
> > 
> >         -       8       81        -        /dev/sdf1
> >         -       8       65        -        /dev/sde1
> >         -       8       49        -        /dev/sdd1
> >         -       8       33        -        /dev/sdc1
> > -sudo mdadm -R /dev/md480
> > mdadm: failed to start array /dev/md480: Input/output error
> > ---
> > NOTE: Of additional interest...
> > ---
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> >             Version : 1.2
> >       Creation Time : Tue Oct 26 14:06:53 2021
> >          Raid Level : raid5
> >       Used Dev Size : 18446744073709551615
> >        Raid Devices : 5
> >       Total Devices : 3
> >         Persistence : Superblock is persistent
> > 
> >         Update Time : Thu May  4 14:39:03 2023
> >               State : active, FAILED, Not Started
> >      Active Devices : 3
> >     Working Devices : 3
> >      Failed Devices : 0
> >       Spare Devices : 0
> > 
> >              Layout : left-symmetric
> >          Chunk Size : 512K
> > 
> > Consistency Policy : unknown
> > 
> >       Delta Devices : 1, (4->5)
> > 
> >                Name : GRANDSLAM:480
> >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> >              Events : 78714
> > 
> >      Number   Major   Minor   RaidDevice State
> >         -       0        0        0      removed
> >         -       0        0        1      removed
> >         -       0        0        2      removed
> >         -       0        0        3      removed
> >         -       0        0        4      removed
> > 
> >         -       8       81        3      sync   /dev/sdf1
> >         -       8       49        1      sync   /dev/sdd1
> >         -       8       33        0      sync   /dev/sdc1
> 
> > So the reason this array can't start is that /dev/sde1 is not
> > recognized as RaidDevice 2, which leaves two RaidDevices missing for
> > a raid5.
> >
> > Sadly I have no idea how to work around this; the superblock metadata seems to be broken.
> 
> Thanks,
> Kuai
> > ---
> > -watch -c -d -n 1 cat /proc/mdstat
> > ---
> > Every 1.0s: cat /proc/mdstat                                                     OAK2023: Mon May 22 01:48:24 2023
> > 
> > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> > md480 : inactive sdf1[4] sdd1[1] sdc1[0]
> >        46877239294 blocks super 1.2
> > 
> > unused devices: <none>
> > ---
> > Hopefully that is some progress towards an array start? It's definitely unexpected output to me.
> > I/O Error starting md480
> > 
> > Thanks!
> > SA
> > 
> > On Thu, 2023-05-18 at 11:15 +0800, Yu Kuai wrote:
> > 
> > > I have no idle why other disk shows that device 2 is missing, and what
> > > is device 4.
> > > 
> > > Anyway, can you try the following?
> > > 
> > > mdadm -I /dev/sdc1
> > > mdadm -D /dev/mdxxx
> > > 
> > > mdadm -I /dev/sdd1
> > > mdadm -D /dev/mdxxx
> > > 
> > > mdadm -I /dev/sde1
> > > mdadm -D /dev/mdxxx
> > > 
> > > mdadm -I /dev/sdf1
> > > mdadm -D /dev/mdxxx
> > > 
> > > If above works well, you can try:
> > > 
> > > mdadm -R /dev/mdxxx, and see if the array can be started.
> > > 
> > > Thanks,
> > > Kuai
> > 
> > 
> > 
> > .
> > 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
  2023-05-22 19:50         ` raid
@ 2023-05-22 23:50           ` Roger Heflin
  2023-05-23  5:04             ` raid
  0 siblings, 1 reply; 9+ messages in thread
From: Roger Heflin @ 2023-05-22 23:50 UTC (permalink / raw)
  To: raid; +Cc: Yu Kuai, Wol, linux-raid, Phil Turmel, NeilBrown, yukuai (C)

Given what the array is reporting, I doubt that is going to fix
anything. The array being in the middle of a reshape makes it
likely that neither n nor n-1 is the right raid config for at least 1/2
of the data, so the filesystem will most likely be completely broken.

Right now the array reports it is a 5 disk array, and the array data
says it was going from 4 disks to 5.

What was the command you used to add the 4th disk? Based on what you
are saying, no one is sure exactly how the array got into this state.
The data being shown disagrees with what you are reporting, so no one
knows what actually happened.
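
For what it's worth, the metadata itself should show roughly where the
boundary is. Something like this (illustrative only; it just filters fields
mdadm -E already prints) would show how far each member thinks the reshape
got and which geometry it believes in:

mdadm -E /dev/sd[c-f]1 | grep -E "Reshape pos'n|Raid Devices|Delta Devices|Events"

If I have the reshape direction right, data below that position is already in
the new layout and data above it is still in the old one.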

On Mon, May 22, 2023 at 3:18 PM raid <raid@electrons.cloud> wrote:
>
> Hi
>
> Thanks for your time so far ! Final questions before I rebuild this RAID from scratch.
>
> BTW I created detailed notes when I created this array (as I have for eight other RAIDs that I maintain).
>     These notes may be applicable later... Here's why.
>
> Do you think that Zero'ing the drives (as is done for initial drive prep) and then recreating the
> RAID5 using the initial settings (originally three drives, NOW four drives) could possibly offer
> a greater chance to recover files? As in, more complete file recovery if the striping aligns
> correctly? Technically, I've had to write off the files that aren't currently backed up.
>
> However, I'm still willing to make an attempt if you think the idea above might yield something
> better than one or two stripes of data per file?
>
> And/Or any other tips for this final attempt? Setting ReadOnly if possible?
>
> Thanks Again
> SA
>
> ---
> Detailed Notes:
> ============================================================
>   2021.10.26 0200P NEW RAID MD480 (48TB) 3x 16TB TOSHIBA
> ========================================================================================================================
> = PREPARATION ==
>
> watch -c -d -n 1 cat /proc/mdstat  ############## OPEN A TERMINAL AND MONITOR STATUS ##
>
> sudo lsblk && sudo blkid  ########################################### VERIFY DEVICES ##
>
> sudo umount /MEGARAID                         # Unmount if filesystem is mounted
> sudo mdadm --stop /dev/md480                  # Stop the RAID/md480 device
> sudo mdadm --zero-superblock /dev/sd[cdf]1    # Zero  the   superblock(s)  on
>                                               #      all members of the array
> sudo mdadm --remove /dev/md480                # Remove the RAID/md480
>
> Edit  ########################################## OPTIONAL FINALIZE PERMANENT REMOVAL ##
> /etc/fstab
> /etc/mdadm/mdadm.conf
> Removing references to the mounting and the definition of the RAID/MD480 device(s)
> NOTE: Some fstab CFG settings allow skipping devices when unavailable at boot. (nofail)
>
> sudo update-initramfs -uv       # -uv  update ; verbose  ########### RESET INITRAMFS ##
>
> ======================================================================================== CREATE RAID & ADD FILESYSTEM ==
>   MEGARAID 2021.10.26 0200P
> ##############  RAID5 ARRAY MD480 32TB (32,001,527,644,160 bytes) Available (3x16TB) ##
>
> sudo mdadm --create --verbose /dev/md480 --level=5 --raid-devices=3 --uuid=2021102502005a7a5a7abeefcafebabe
> /dev/sd[cdf]1
>
> 31,251,491,840 BLOCKS CREATED IN ~20 HOURS
>
> ############################################################  CREATE FILESYSTEM EXT4 ##
>  -v VERBOSE
>  -L DISK LABEL
>  -U UUID FORMATTED AS 8CHARS-4CHARS-4CHARS-4CHARS-12CHARS
>  -m OVERFLOW PROTECTION PERCENTAGE IE. .025 OF 24,576GB IS ~615MB FREE IS CONSIDERED FULL
>  -b BLOCK SIZE 1/4 OF STRIDE= OFFERS BEST OVERALL PERFORMANCE
>  -E STRIDE= MULTIPLE OF 8
>     STRIPE-WIDTH= STRIDE X 2
>
> sudo mkfs.ext4 -v -L MEGARAID    -U 20211028-0500-5a7a-5a7a-beefcafebabe -m .025 -b 4096 -E stride=32,stripe-width=64
> /dev/md480
>
> sudo mkdir  /MEGARAID  ; sudo chown adminx:adminx -R /MEGARAID
>
> ##############################################################  SET CORRECT HOMEHOST ##
>
> sudo umount /MEGARAID
> sudo mdadm --stop /dev/md480
> sudo mdadm --assemble --update=homehost --homehost=GRANDSLAM /dev/md480 /dev/sd[cdf]1
> sudo blkid
>
> /dev/sdc1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
>            UUID_SUB="8f0835db-3ea2-4540-2ab4-232d6203d1b7"
>            LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
>            PARTLABEL="HIT*16TB*001*RAID5"
>            PARTUUID="3b68fe63-35d0-404d-912e-dfe1127f109b"
>
> /dev/sdd1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
>            UUID_SUB="b4660f49-867b-9f1e-ecad-0acec7119c37"
>            LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
>            PARTLABEL="HIT*16TB*002*RAID5"
>            PARTUUID="32c50f4f-f6ce-4309-b8e4-facdb6e05ba8"
>
> /dev/sdf1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
>            UUID_SUB="79a3dff4-c53f-9071-f9c1-c262403fbc10"
>            LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
>            PARTLABEL="HIT*16TB*003*RAID5"
>            PARTUUID="7ec27f96-2275-4e09-9013-ac056f11ebfb"
>
> /dev/md480: LABEL="MEGARAID" UUID="20211028-0500-5a7a-5a7a-beefcafebabe" TYPE="ext4"
>
> ############################################################### ENTRY FOR /ETC/FSTAB ##
>
> /dev/md480              /MEGARAID               ext4            nofail,noatime,nodiratime,relatime,errors=remount-ro
> 0               2
>
> #################################################### ENTRY FOR /ETC/MDADM/MDADM.CONF ##
>
> ARRAY /dev/md480 metadata=1.2 name=GRANDSLAM:480 UUID=20211025:02005a7a:5a7abeef:cafebabe
>
> #######################################################################################
>
> sudo update-initramfs -uv       # -uv  update ; verbose
> sudo mount -a
> sudo chown adminx:adminx -R /MEGARAID
>
> ############################################################### END 2021.10.28 0545A ##
>
>
>
>
>
>
> On Mon, 2023-05-22 at 15:51 +0800, Yu Kuai wrote:
> > Hi,
> >
> > On 2023/05/22 14:56, raid wrote:
> > > Hi,
> > > Thanks for the guidance as the current state has at least changed somewhat.
> > >
> > > BTW Sorry about Life getting in the way of tech. =) Reason for my delayed response.
> > >
> > > -sudo mdadm -I /dev/sdc1
> > > mdadm: /dev/sdc1 attached to /dev/md480, not enough to start (1).
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >          Raid Level : raid0
> > >       Total Devices : 1
> > >         Persistence : Superblock is persistent
> > >
> > >               State : inactive
> > >     Working Devices : 1
> > >
> > >       Delta Devices : 1, (-1->0)
> > >           New Level : raid5
> > >          New Layout : left-symmetric
> > >       New Chunksize : 512K
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78714
> > >
> > >      Number   Major   Minor   RaidDevice
> > >
> > >         -       8       33        -        /dev/sdc1
> > > -sudo mdadm -I /dev/sdd1
> > > mdadm: /dev/sdd1 attached to /dev/md480, not enough to start (2).
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >          Raid Level : raid0
> > >       Total Devices : 2
> > >         Persistence : Superblock is persistent
> > >
> > >               State : inactive
> > >     Working Devices : 2
> > >
> > >       Delta Devices : 1, (-1->0)
> > >           New Level : raid5
> > >          New Layout : left-symmetric
> > >       New Chunksize : 512K
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78714
> > >
> > >      Number   Major   Minor   RaidDevice
> > >
> > >         -       8       49        -        /dev/sdd1
> > >         -       8       33        -        /dev/sdc1
> > > -sudo mdadm -I /dev/sde1
> > > mdadm: /dev/sde1 attached to /dev/md480, not enough to start (2).
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >          Raid Level : raid0
> > >       Total Devices : 3
> > >         Persistence : Superblock is persistent
> > >
> > >               State : inactive
> > >     Working Devices : 3
> > >
> > >       Delta Devices : 1, (-1->0)
> > >           New Level : raid5
> > >          New Layout : left-symmetric
> > >       New Chunksize : 512K
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78712
> > >
> > >      Number   Major   Minor   RaidDevice
> > >
> > >         -       8       65        -        /dev/sde1
> > >         -       8       49        -        /dev/sdd1
> > >         -       8       33        -        /dev/sdc1
> > > -sudo mdadm -I /dev/sdf1
> > > mdadm: /dev/sdf1 attached to /dev/md480, not enough to start (3).
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >          Raid Level : raid0
> > >       Total Devices : 4
> > >         Persistence : Superblock is persistent
> > >
> > >               State : inactive
> > >     Working Devices : 4
> > >
> > >       Delta Devices : 1, (-1->0)
> > >           New Level : raid5
> > >          New Layout : left-symmetric
> > >       New Chunksize : 512K
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78714
> > >
> > >      Number   Major   Minor   RaidDevice
> > >
> > >         -       8       81        -        /dev/sdf1
> > >         -       8       65        -        /dev/sde1
> > >         -       8       49        -        /dev/sdd1
> > >         -       8       33        -        /dev/sdc1
> > > -sudo mdadm -R /dev/md480
> > > mdadm: failed to start array /dev/md480: Input/output error
> > > ---
> > > NOTE: Of additional interest...
> > > ---
> > > -sudo mdadm -D /dev/md480
> > > /dev/md480:
> > >             Version : 1.2
> > >       Creation Time : Tue Oct 26 14:06:53 2021
> > >          Raid Level : raid5
> > >       Used Dev Size : 18446744073709551615
> > >        Raid Devices : 5
> > >       Total Devices : 3
> > >         Persistence : Superblock is persistent
> > >
> > >         Update Time : Thu May  4 14:39:03 2023
> > >               State : active, FAILED, Not Started
> > >      Active Devices : 3
> > >     Working Devices : 3
> > >      Failed Devices : 0
> > >       Spare Devices : 0
> > >
> > >              Layout : left-symmetric
> > >          Chunk Size : 512K
> > >
> > > Consistency Policy : unknown
> > >
> > >       Delta Devices : 1, (4->5)
> > >
> > >                Name : GRANDSLAM:480
> > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > >              Events : 78714
> > >
> > >      Number   Major   Minor   RaidDevice State
> > >         -       0        0        0      removed
> > >         -       0        0        1      removed
> > >         -       0        0        2      removed
> > >         -       0        0        3      removed
> > >         -       0        0        4      removed
> > >
> > >         -       8       81        3      sync   /dev/sdf1
> > >         -       8       49        1      sync   /dev/sdd1
> > >         -       8       33        0      sync   /dev/sdc1
> >
> > So the reason this array can't start is that /dev/sde1 is not
> > recognized as RaidDevice 2, which leaves two RaidDevices missing for
> > a raid5.
> >
> > Sadly I have no idea how to work around this; the superblock metadata seems to be broken.
> >
> > Thanks,
> > Kuai
> > > ---
> > > -watch -c -d -n 1 cat /proc/mdstat
> > > ---
> > > Every 1.0s: cat /proc/mdstat                                                     OAK2023: Mon May 22 01:48:24 2023
> > >
> > > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> > > md480 : inactive sdf1[4] sdd1[1] sdc1[0]
> > >        46877239294 blocks super 1.2
> > >
> > > unused devices: <none>
> > > ---
> > > Hopefully that is some progress towards an array start? It's definitely unexpected output to me.
> > > I/O Error starting md480
> > >
> > > Thanks!
> > > SA
> > >
> > > On Thu, 2023-05-18 at 11:15 +0800, Yu Kuai wrote:
> > >
> > > > I have no idle why other disk shows that device 2 is missing, and what
> > > > is device 4.
> > > >
> > > > Anyway, can you try the following?
> > > >
> > > > mdadm -I /dev/sdc1
> > > > mdadm -D /dev/mdxxx
> > > >
> > > > mdadm -I /dev/sdd1
> > > > mdadm -D /dev/mdxxx
> > > >
> > > > mdadm -I /dev/sde1
> > > > mdadm -D /dev/mdxxx
> > > >
> > > > mdadm -I /dev/sdf1
> > > > mdadm -D /dev/mdxxx
> > > >
> > > > If above works well, you can try:
> > > >
> > > > mdadm -R /dev/mdxxx, and see if the array can be started.
> > > >
> > > > Thanks,
> > > > Kuai
> > >
> > >
> > >
> > > .
> > >
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
  2023-05-22 23:50           ` Roger Heflin
@ 2023-05-23  5:04             ` raid
  0 siblings, 0 replies; 9+ messages in thread
From: raid @ 2023-05-23  5:04 UTC (permalink / raw)
  To: Roger Heflin; +Cc: Yu Kuai, Wol, linux-raid, Phil Turmel, NeilBrown, yukuai (C)

Hi

The command that I used was straightforward, I believe.

Here goes; hopefully this helps clarify (best that I can recall, since the terminal history
was destroyed).

sync; sudo umount /MEGARAID
sudo mdadm -S /dev/md480
sudo mdadm --add /dev/md480 /dev/sde1 ## New drive was prepped; it was /dev/sde initially
sudo mdadm --grow --raid-devices=4 /dev/md480
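
(In hindsight, I could have pointed the reshape at an external backup file so
it would have had a checkpoint to resume from. Something like the line below
is what I mean; purely illustrative, the path is made up, and it is NOT what I ran:)

sudo mdadm --grow --raid-devices=4 --backup-file=/root/md480-grow.backup /dev/md480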

I have a function within .bashrc watchraid(){ watch -c -d -n 1 cat /proc/mdstat ;} 
So, that is running all the time and it reported ~4-5 days to complete.

I paused it successfully using either...
echo "frozen" > /sys/block/md480/md/sync_action
OR
echo "idle" >  /sys/block/md480/md/sync_action
(No history available.)
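
To see which of the two actually took effect, reading the same attribute back
would have shown the current state (noting it here since my shell history is gone):

cat /sys/block/md480/md/sync_action    # prints the current action, e.g. "frozen", "reshape" or "idle"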

Used rsync to copy ~450GB from network share onto the idling raid.
When rsync completed successfully I attempted to rename the rsync log file using Thunar.
It flashed back to the file listing without renaming the file.
So, I tried the exact same name again. Same failure.

Within seconds, every open application/terminal window stopped responding.
(Looking back, I probably should have tried a TTY terminal.) Reshaping was at ~30%
and stalled, X was completely unresponsive, so I hit the reset switch.

The reboot was absolutely normal. The RAID was operational and filesystem intact.
Here's where it gets fuzzy: panic was setting in because the drives didn't look right in
the watch terminal. There were five drives now, one showing failed/removed. I can't recall
if it started reshaping automatically; it shouldn't have. I'm now beginning to believe that
I made a sloppy typo while restarting the reshape manually. In any event, I continued with
routine daily activities while the RAID reshaped. A lot happens in 3-5 days while reshaping.

At ~80% the speed slowed to 0.
sudo umount /MEGARAID
sudo mdadm -S /dev/md480

I had to add a new card to the machine; 1TB NVMe on a PCIe card.
Shut down. Added the card. Booted; all looked normal.
However, the filesystem on the RAID couldn't auto-mount.

watch -c -d -n 1 cat /proc/mdstat

All previously active drives were now listed as spares. So, life goes on. And here I am now.
The 1TB NVMe that I added is formatted, fully functional and hasn't interfered.

Anyhow, after I create a RAID, I prefer to let mdadm handle itself. In other words, I avoid
using the --force option. mdadm has been very robust over the years.

I can only conclude that I made a sloppy typo. Right now I'm just trying to deal with the
current state of the array.

I appreciate that your free time has great value. Thank you.
SA

Oh, kind of a silver lining that may be of interest: I found that the ~7,000 scanned documents
were cataloged/thumbnailed in a photo album within the digiKam program on a non-RAID partition.
Most thumbnails are absolutely legible, so I at least have a clear record of the originals that are lost.

Cheers for having a few lucky breaks.

On Mon, 2023-05-22 at 18:50 -0500, Roger Heflin wrote:
> Given what the array is reporting I would doubt that is going to fix
> anything.    The array being in the middle of a reshape makes it
> likely that neither n or n-1 is the right raid config for at least 1/2
> of the data, so it is likely the filesystem will be completely broken.
> 
> Right now the array reports it is a 5 disk array, and the array data
> says it was going from 4 disks to 5.
> 
> What was the command you used to add the 4th disk?     No one is sure
> based on what you are saying how exactly the array got into this
> state.   The data being shown disagrees with what you are reporting,
> and given that no one knows what actually happened.
> 
> On Mon, May 22, 2023 at 3:18 PM raid <raid@electrons.cloud> wrote:
> > Hi
> > 
> > Thanks for your time so far ! Final questions before I rebuild this RAID from scratch.
> > 
> > BTW I created detailed notes when I created this array (as I have for eight other RAIDs that I maintain).
> >     These notes may be applicable later... Here's why.
> > 
> > Do you think that Zero'ing the drives (as is done for initial drive prep) and then recreating the
> > RAID5 using the initial settings (originally three drives, NOW four drives) could possibly offer
> > a greater chance to recover files? As in, more complete file recovery if the striping aligns
> > correctly? Technically, I've had to write off the files that aren't currently backed up.
> > 
> > However, I'm still willing to make an attempt if you think the idea above might yield something
> > better than one or two stripes of data per file?
> > 
> > And/Or any other tips for this final attempt? Setting ReadOnly if possible?
> > 
> > Thanks Again
> > SA
> > 
> > ---
> > Detailed Notes:
> > ============================================================
> >   2021.10.26 0200P NEW RAID MD480 (48TB) 3x 16TB TOSHIBA
> > ====================================================================================================================
> > ====
> > = PREPARATION ==
> > 
> > watch -c -d -n 1 cat /proc/mdstat  ############## OPEN A TERMINAL AND MONITOR STATUS ##
> > 
> > sudo lsblk && sudo blkid  ########################################### VERIFY DEVICES ##
> > 
> > sudo umount /MEGARAID                         # Unmount if filesystem is mounted
> > sudo mdadm --stop /dev/md480                  # Stop the RAID/md480 device
> > sudo mdadm --zero-superblock /dev/sd[cdf]1    # Zero  the   superblock(s)  on
> >                                               #      all members of the array
> > sudo mdadm --remove /dev/md480                # Remove the RAID/md480
> > 
> > Edit  ########################################## OPTIONAL FINALIZE PERMANENT REMOVAL ##
> > /etc/fstab
> > /etc/mdadm/mdadm.conf
> > Removing references to the mounting and the definition of the RAID/MD480 device(s)
> > NOTE: Some fstab CFG settings allow skipping devices when unavailable at boot. (nofail)
> > 
> > sudo update-initramfs -uv       # -uv  update ; verbose  ########### RESET INITRAMFS ##
> > 
> > ======================================================================================== CREATE RAID & ADD
> > FILESYSTEM ==
> >   MEGARAID 2021.10.26 0200P
> > ##############  RAID5 ARRAY MD480 32TB (32,001,527,644,160 bytes) Available (3x16TB) ##
> > 
> > sudo mdadm --create --verbose /dev/md480 --level=5 --raid-devices=3 --uuid=2021102502005a7a5a7abeefcafebabe
> > /dev/sd[cdf]1
> > 
> > 31,251,491,840 BLOCKS CREATED IN ~20 HOURS
> > 
> > ############################################################  CREATE FILESYSTEM EXT4 ##
> >  -v VERBOSE
> >  -L DISK LABEL
> >  -U UUID FORMATTED AS 8CHARS-4CHARS-4CHARS-4CHARS-12CHARS
> >  -m OVERFLOW PROTECTION PERCENTAGE IE. .025 OF 24,576GB IS ~615MB FREE IS CONSIDERED FULL
> >  -b BLOCK SIZE 1/4 OF STRIDE= OFFERS BEST OVERALL PERFORMANCE
> >  -E STRIDE= MULTIPLE OF 8
> >     STRIPE-WIDTH= STRIDE X 2
> > 
> > sudo mkfs.ext4 -v -L MEGARAID    -U 20211028-0500-5a7a-5a7a-beefcafebabe -m .025 -b 4096 -E stride=32,stripe-
> > width=64
> > /dev/md480
> > 
> > sudo mkdir  /MEGARAID  ; sudo chown adminx:adminx -R /MEGARAID
> > 
> > ##############################################################  SET CORRECT HOMEHOST ##
> > 
> > sudo umount /MEGARAID
> > sudo mdadm --stop /dev/md480
> > sudo mdadm --assemble --update=homehost --homehost=GRANDSLAM /dev/md480 /dev/sd[cdf]1
> > sudo blkid
> > 
> > /dev/sdc1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
> >            UUID_SUB="8f0835db-3ea2-4540-2ab4-232d6203d1b7"
> >            LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
> >            PARTLABEL="HIT*16TB*001*RAID5"
> >            PARTUUID="3b68fe63-35d0-404d-912e-dfe1127f109b"
> > 
> > /dev/sdd1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
> >            UUID_SUB="b4660f49-867b-9f1e-ecad-0acec7119c37"
> >            LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
> >            PARTLABEL="HIT*16TB*002*RAID5"
> >            PARTUUID="32c50f4f-f6ce-4309-b8e4-facdb6e05ba8"
> > 
> > /dev/sdf1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
> >            UUID_SUB="79a3dff4-c53f-9071-f9c1-c262403fbc10"
> >            LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
> >            PARTLABEL="HIT*16TB*003*RAID5"
> >            PARTUUID="7ec27f96-2275-4e09-9013-ac056f11ebfb"
> > 
> > /dev/md480: LABEL="MEGARAID" UUID="20211028-0500-5a7a-5a7a-beefcafebabe" TYPE="ext4"
> > 
> > ############################################################### ENTRY FOR /ETC/FSTAB ##
> > 
> > /dev/md480              /MEGARAID               ext4            nofail,noatime,nodiratime,relatime,errors=remount-ro
> > 0               2
> > 
> > #################################################### ENTRY FOR /ETC/MDADM/MDADM.CONF ##
> > 
> > ARRAY /dev/md480 metadata=1.2 name=GRANDSLAM:480 UUID=20211025:02005a7a:5a7abeef:cafebabe
> > 
> > #######################################################################################
> > 
> > sudo update-initramfs -uv       # -uv  update ; verbose
> > sudo mount -a
> > sudo chown adminx:adminx -R /MEGARAID
> > 
> > ############################################################### END 2021.10.28 0545A ##
> > 
> > 
> > 
> > 
> > 
> > 
> > On Mon, 2023-05-22 at 15:51 +0800, Yu Kuai wrote:
> > > Hi,
> > > 
> > > On 2023/05/22 14:56, raid wrote:
> > > > Hi,
> > > > Thanks for the guidance as the current state has at least changed somewhat.
> > > > 
> > > > BTW Sorry about Life getting in the way of tech. =) Reason for my delayed response.
> > > > 
> > > > -sudo mdadm -I /dev/sdc1
> > > > mdadm: /dev/sdc1 attached to /dev/md480, not enough to start (1).
> > > > -sudo mdadm -D /dev/md480
> > > > /dev/md480:
> > > >             Version : 1.2
> > > >          Raid Level : raid0
> > > >       Total Devices : 1
> > > >         Persistence : Superblock is persistent
> > > > 
> > > >               State : inactive
> > > >     Working Devices : 1
> > > > 
> > > >       Delta Devices : 1, (-1->0)
> > > >           New Level : raid5
> > > >          New Layout : left-symmetric
> > > >       New Chunksize : 512K
> > > > 
> > > >                Name : GRANDSLAM:480
> > > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > > >              Events : 78714
> > > > 
> > > >      Number   Major   Minor   RaidDevice
> > > > 
> > > >         -       8       33        -        /dev/sdc1
> > > > -sudo mdadm -I /dev/sdd1
> > > > mdadm: /dev/sdd1 attached to /dev/md480, not enough to start (2).
> > > > -sudo mdadm -D /dev/md480
> > > > /dev/md480:
> > > >             Version : 1.2
> > > >          Raid Level : raid0
> > > >       Total Devices : 2
> > > >         Persistence : Superblock is persistent
> > > > 
> > > >               State : inactive
> > > >     Working Devices : 2
> > > > 
> > > >       Delta Devices : 1, (-1->0)
> > > >           New Level : raid5
> > > >          New Layout : left-symmetric
> > > >       New Chunksize : 512K
> > > > 
> > > >                Name : GRANDSLAM:480
> > > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > > >              Events : 78714
> > > > 
> > > >      Number   Major   Minor   RaidDevice
> > > > 
> > > >         -       8       49        -        /dev/sdd1
> > > >         -       8       33        -        /dev/sdc1
> > > > -sudo mdadm -I /dev/sde1
> > > > mdadm: /dev/sde1 attached to /dev/md480, not enough to start (2).
> > > > -sudo mdadm -D /dev/md480
> > > > /dev/md480:
> > > >             Version : 1.2
> > > >          Raid Level : raid0
> > > >       Total Devices : 3
> > > >         Persistence : Superblock is persistent
> > > > 
> > > >               State : inactive
> > > >     Working Devices : 3
> > > > 
> > > >       Delta Devices : 1, (-1->0)
> > > >           New Level : raid5
> > > >          New Layout : left-symmetric
> > > >       New Chunksize : 512K
> > > > 
> > > >                Name : GRANDSLAM:480
> > > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > > >              Events : 78712
> > > > 
> > > >      Number   Major   Minor   RaidDevice
> > > > 
> > > >         -       8       65        -        /dev/sde1
> > > >         -       8       49        -        /dev/sdd1
> > > >         -       8       33        -        /dev/sdc1
> > > > -sudo mdadm -I /dev/sdf1
> > > > mdadm: /dev/sdf1 attached to /dev/md480, not enough to start (3).
> > > > -sudo mdadm -D /dev/md480
> > > > /dev/md480:
> > > >             Version : 1.2
> > > >          Raid Level : raid0
> > > >       Total Devices : 4
> > > >         Persistence : Superblock is persistent
> > > > 
> > > >               State : inactive
> > > >     Working Devices : 4
> > > > 
> > > >       Delta Devices : 1, (-1->0)
> > > >           New Level : raid5
> > > >          New Layout : left-symmetric
> > > >       New Chunksize : 512K
> > > > 
> > > >                Name : GRANDSLAM:480
> > > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > > >              Events : 78714
> > > > 
> > > >      Number   Major   Minor   RaidDevice
> > > > 
> > > >         -       8       81        -        /dev/sdf1
> > > >         -       8       65        -        /dev/sde1
> > > >         -       8       49        -        /dev/sdd1
> > > >         -       8       33        -        /dev/sdc1
> > > > -sudo mdadm -R /dev/md480
> > > > mdadm: failed to start array /dev/md480: Input/output error
> > > > ---
> > > > NOTE: Of additional interest...
> > > > ---
> > > > -sudo mdadm -D /dev/md480
> > > > /dev/md480:
> > > >             Version : 1.2
> > > >       Creation Time : Tue Oct 26 14:06:53 2021
> > > >          Raid Level : raid5
> > > >       Used Dev Size : 18446744073709551615
> > > >        Raid Devices : 5
> > > >       Total Devices : 3
> > > >         Persistence : Superblock is persistent
> > > > 
> > > >         Update Time : Thu May  4 14:39:03 2023
> > > >               State : active, FAILED, Not Started
> > > >      Active Devices : 3
> > > >     Working Devices : 3
> > > >      Failed Devices : 0
> > > >       Spare Devices : 0
> > > > 
> > > >              Layout : left-symmetric
> > > >          Chunk Size : 512K
> > > > 
> > > > Consistency Policy : unknown
> > > > 
> > > >       Delta Devices : 1, (4->5)
> > > > 
> > > >                Name : GRANDSLAM:480
> > > >                UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > > >              Events : 78714
> > > > 
> > > >      Number   Major   Minor   RaidDevice State
> > > >         -       0        0        0      removed
> > > >         -       0        0        1      removed
> > > >         -       0        0        2      removed
> > > >         -       0        0        3      removed
> > > >         -       0        0        4      removed
> > > > 
> > > >         -       8       81        3      sync   /dev/sdf1
> > > >         -       8       49        1      sync   /dev/sdd1
> > > >         -       8       33        0      sync   /dev/sdc1
> > > 
> > > So the reason this array can't start is that /dev/sde1 is not
> > > recognized as RaidDevice 2, which leaves two RaidDevices missing for
> > > a raid5.
> > >
> > > Sadly I have no idea how to work around this; the superblock metadata seems to be broken.
> > > 
> > > Thanks,
> > > Kuai
> > > > ---
> > > > -watch -c -d -n 1 cat /proc/mdstat
> > > > ---
> > > > Every 1.0s: cat /proc/mdstat                                                     OAK2023: Mon May 22 01:48:24
> > > > 2023
> > > > 
> > > > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> > > > md480 : inactive sdf1[4] sdd1[1] sdc1[0]
> > > >        46877239294 blocks super 1.2
> > > > 
> > > > unused devices: <none>
> > > > ---
> > > > Hopefully that is some progress towards an array start? It's definitely unexpected output to me.
> > > > I/O Error starting md480
> > > > 
> > > > Thanks!
> > > > SA
> > > > 
> > > > On Thu, 2023-05-18 at 11:15 +0800, Yu Kuai wrote:
> > > > 
> > > > > I have no idle why other disk shows that device 2 is missing, and what
> > > > > is device 4.
> > > > > 
> > > > > Anyway, can you try the following?
> > > > > 
> > > > > mdadm -I /dev/sdc1
> > > > > mdadm -D /dev/mdxxx
> > > > > 
> > > > > mdadm -I /dev/sdd1
> > > > > mdadm -D /dev/mdxxx
> > > > > 
> > > > > mdadm -I /dev/sde1
> > > > > mdadm -D /dev/mdxxx
> > > > > 
> > > > > mdadm -I /dev/sdf1
> > > > > mdadm -D /dev/mdxxx
> > > > > 
> > > > > If above works well, you can try:
> > > > > 
> > > > > mdadm -R /dev/mdxxx, and see if the array can be started.
> > > > > 
> > > > > Thanks,
> > > > > Kuai
> > > > 
> > > > .
> > > > 


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-05-23  5:04 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-17 13:26 RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK) raid
2023-05-17 23:45 ` Wol
2023-05-18  3:15   ` Yu Kuai
2023-05-22  6:56     ` raid
2023-05-22  7:51       ` Yu Kuai
2023-05-22 19:50         ` raid
2023-05-22 23:50           ` Roger Heflin
2023-05-23  5:04             ` raid
2023-05-22  7:20     ` raid
