* Fwd: Recover RAID5
From: Samuel Lopes @ 2022-11-28 23:33 UTC
To: linux-raid
Hello everyone,
Before anything else, let me apologize for my bad English.
Here is my problem:
I was planning to replace all the disks in a 6-drive RAID5 array with
bigger ones and then grow it. So I failed/removed the first drive,
replaced it and let the array rebuild. During that process another
drive had problems and got kicked out of the array, leaving me with 4
working drives and 2 spares (the new one and the one that was kicked).
After a reboot I ran a SMART test on the kicked drive, which showed no
errors (at least for now).
Nevertheless, after trying to assemble the array again those 2 drives
were always marked as spare, even after failing, re-adding and
force-assembling them.
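Roughly, the commands I used looked like this (just a sketch from
memory; /dev/sdX1 and /dev/sdY1 stand in for the member partitions, so
the names are not exact):

mdadm --stop /dev/md4
mdadm --assemble --force --verbose /dev/md4 /dev/sdX1 ... /dev/sdY1   # the six member partitions
# and, for the two drives that kept coming back as spare:
mdadm /dev/md4 --fail /dev/sdX1 --remove /dev/sdX1
mdadm /dev/md4 --re-add /dev/sdX1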
And here is where I probably made a big mistake: I tried to create
the array again with the same parameters and drive order, but now I'm
unable to mount the filesystem.
I have a saved --detail output, and I have the boot logs to check the
correct drive order.
Old --detail output:
Version : 1.2
Creation Time : Fri Mar 23 14:09:57 2018
Raid Level : raid5
Array Size : 48831518720 (45.48 TiB 50.00 TB)
Used Dev Size : 9766303744 (9.10 TiB 10.00 TB)
Raid Devices : 6
Total Devices : 5
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Wed Jul 20 11:14:14 2022
State : clean, degraded
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : home:4 (local to host home)
UUID : 2942a897:e0fb246c:a4a823fd:e0a6f0af
Events : 633890
Number Major Minor RaidDevice State
0 8 97 0 active sync /dev/sdg1
1 8 113 1 active sync /dev/sdh1
2 8 193 2 active sync /dev/sdm1
7 8 209 3 active sync /dev/sdn1
- 0 0 4 removed
6 8 225 5 active sync /dev/sdo1
The drive names are different now, but I checked the serial numbers
against the boot log to make sure I had the correct order:
Nov 21 09:10:52 [localhost] kernel: [ 2.108665] md/raid:md4: device sdg1 operational as raid disk 0
Nov 21 09:10:52 [localhost] kernel: [ 2.108670] md/raid:md4: device sdn1 operational as raid disk 3
Nov 21 09:10:52 [localhost] kernel: [ 2.108671] md/raid:md4: device sdm1 operational as raid disk 2
Nov 21 09:10:52 [localhost] kernel: [ 2.108673] md/raid:md4: device sdo1 operational as raid disk 5
Nov 21 09:10:52 [localhost] kernel: [ 2.108674] md/raid:md4: device sdi1 operational as raid disk 1
Nov 21 09:10:52 [localhost] kernel: [ 2.108675] md/raid:md4: device sdp1 operational as raid disk 4
Then I ran:
mdadm -v --create /dev/md5 --level=5 --chunk=512K --metadata=1.2
--layout=left-symmetric --data-offset=262144s --raid-devices=6
/dev/sdn1 /dev/sdj1 /dev/sdl1 /dev/sdm1 /dev/sdo1 /dev/sdr1
--assume-clean
I'm not sure about the data-offset but that value makes the array size
match the old --detail output. New --detail output:
Version : 1.2
Creation Time : Mon Nov 28 15:07:36 2022
Raid Level : raid5
Array Size : 48831518720 (45.48 TiB 50.00 TB)
Used Dev Size : 9766303744 (9.10 TiB 10.00 TB)
Raid Devices : 6
Total Devices : 6
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Nov 28 15:07:36 2022
State : clean
Active Devices : 6
Working Devices : 6
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : bitmap
Name : home:5 (local to host home)
UUID : 5da28055:2d1ca246:98a0aa83:eba65394
Events : 1
Number Major Minor RaidDevice State
0 8 209 0 active sync /dev/sdn1
1 8 145 1 active sync /dev/sdj1
2 8 177 2 active sync /dev/sdl1
3 8 193 3 active sync /dev/sdm1
4 8 225 4 active sync /dev/sdo1
5 65 17 5 active sync /dev/sdr1
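(For reference, 262144 sectors * 512 bytes is a 128 MiB data offset
per member. With 1.2 metadata, mdadm --examine reports it for each
member, e.g.:

mdadm --examine /dev/sdn1 | grep -i 'data offset'

though right now that of course only shows the value I just passed to
--create.)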
But I'm unable to mount:
sudo mount -v -o ro /dev/md5 /media/test
mount: /media/test: wrong fs type, bad option, bad superblock on
/dev/md5, missing codepage or helper program, or other error.
dmesg(1) may have more information after failed mount system call.
No info at all in dmesg.
I also tried recreating the array using the old drive that I had
replaced, with a missing member in place of the kicked drive, and got
the same result.
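In case it helps, I can post the output of read-only checks such as
these (the fsck.ext4 line assumes the filesystem is ext4; none of
them write to the array):

file -s /dev/md5
blkid /dev/md5
fsck.ext4 -n /dev/md5   # -n: check only, answer "no" to everything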
I'm pretty lost here so I would welcome any help at all.
Linux home 6.0.0-2-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.0.3-1
(2022-10-21) x86_64 GNU/Linux
mdadm - v4.2 - 2021-12-30 - Debian 4.2-4+b1
Thank you in advance,
Samuel
* Re: Fwd: Recover RAID5
From: Carlos Carvalho @ 2022-11-29 22:49 UTC
To: linux-raid
Samuel Lopes (samuelblopes@gmail.com) wrote on Mon, Nov 28, 2022 at 08:33:31PM -03:
> Then I run:
>
> mdadm -v --create /dev/md5 --level=5 --chunk=512K --metadata=1.2
> --layout=left-symmetric --data-offset=262144s --raid-devices=6
> /dev/sdn1 /dev/sdj1 /dev/sdl1 /dev/sdm1 /dev/sdo1 /dev/sdr1
> --assume-clean
>
> I'm not sure about the data-offset but that value makes the array size
> match the old --detail output
Well... you'd have to know it exactly. It's strange that it's needed; it
shouldn't be, unless the original command used it. The mdadm versions should
also be the same.
You can try comparing the output of mdadm -E for the disks you know are good
with that of the new/failed ones to see if there are differences.
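Something like this (sdX1/sdY1 are placeholders for a known-good member and
one of the problematic ones):

diff <(mdadm -E /dev/sdX1) <(mdadm -E /dev/sdY1)

or dump them all to a file and look at the Array UUID, Events and Data Offset
lines:

for d in /dev/sd?1; do mdadm -E "$d"; done > examine.txt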
Besides this, the only suggestion I have is to try different values of
data-offset (starting without the option) to see if any of them makes the
array mountable. Even so the risk of corruption is high, so you should run
fsck.
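A rough sketch of what I mean (the candidate offsets are only examples, not
an exhaustive list; each --create will ask for confirmation because the
members carry old superblocks, and the fsck.ext4 check assumes ext4):

for off in 2048 8192 16384 131072 262144; do
    mdadm --stop /dev/md5
    mdadm --create /dev/md5 --assume-clean --bitmap=none --level=5 \
          --chunk=512K --metadata=1.2 --layout=left-symmetric \
          --data-offset=${off}s --raid-devices=6 \
          /dev/sdn1 /dev/sdj1 /dev/sdl1 /dev/sdm1 /dev/sdo1 /dev/sdr1
    fsck.ext4 -n /dev/md5 && echo "offset ${off}s looks plausible" && break
done

--bitmap=none is there so a new internal bitmap is not written on every
attempt, and keep the device order you verified from the boot log. Only
mount read-only until one of the offsets passes fsck cleanly.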
If you want high reliability, the whole installation process is bad, starting
with the use of raid5. Next time, don't do it this way...
* Re: Fwd: Recover RAID5
From: Samuel Lopes @ 2022-12-01 19:56 UTC
To: linux-raid, carlos
Carlos, thank you for your reply.
Unless I'm missing something, mdadm -E on each disk now shows only the
information from my last --create attempt.
To make things worse, this is a 4-year-old array and I added/removed
several disks with different mdadm versions over time, so from what I
have been reading it seems each disk can have a different data-offset
value (is that correct?).
Also, is there a way to know what the default data-offset was for
different versions of mdadm?
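One idea I had, in case it is sound: instead of guessing, check candidate
offsets directly for an ext4 superblock signature on the member that held
raid disk 0 (if I understand left-symmetric correctly, the first data chunk
of the array lives there). This assumes the filesystem is ext4 and /dev/sdX1
stands for that member; the ext4 magic bytes 53 ef sit 1024+56 bytes past
the start of the filesystem:

OFF=262144   # candidate data offset, in sectors
dd if=/dev/sdX1 bs=512 skip=$((OFF + 2)) count=1 status=none | hexdump -C | head -4
# if OFF is right, the row labelled 00000030 should show "53 ef" at bytes 0x38-0x39

Does that make sense, or is there a better way?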
Regards,
Samuel