From: raid <raid@electrons.cloud>
To: Yu Kuai <yukuai1@huaweicloud.com>, Wol <antlists@youngman.org.uk>,
linux-raid@vger.kernel.org
Cc: Phil Turmel <philip@turmel.org>, NeilBrown <neilb@suse.com>,
"yukuai (C)" <yukuai3@huawei.com>
Subject: Re: RAID5 Phantom Drive Appeared while Reshaping Four Drive Array (HARDLOCK)
Date: Mon, 22 May 2023 14:50:06 -0500
Message-ID: <3f3fd288e33cccdae32e3f26943218c773589849.camel@electrons.cloud>
In-Reply-To: <a78c4551-defb-d531-3b5e-372889158f28@huaweicloud.com>
Hi,
Thanks for your time so far! Final questions before I rebuild this RAID from scratch.
BTW, I kept detailed notes when I built this array (as I have for the eight other RAIDs I maintain).
Those notes may be applicable later... here's why.
Do you think that zeroing the drives (as is done for initial drive prep) and then recreating the
RAID5 with the original settings (originally three drives, NOW four drives) could offer a greater
chance of recovering files? That is, more complete file recovery if the striping aligns correctly?
Technically, I've had to write off the files that aren't currently backed up. However, I'm still
willing to make an attempt if you think the idea above might yield something better than one or
two stripes of data per file.
And/or any other tips for this final attempt? Setting the array read-only, if possible?
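For concreteness, this is roughly what I'm picturing, every bit of it an untested assumption on
my part: the device order and chunk size below come from my notes, and they would have to match
the original creation exactly (and the right geometry may well be the four-drive one, since the
reshape had started), or the striping won't line up at all.

echo 1 | sudo tee /sys/module/md_mod/parameters/start_ro    # new arrays start auto-read-only
sudo mdadm --create /dev/md480 --verbose --assume-clean --level=5 --raid-devices=3 \
     --chunk=512 /dev/sdc1 /dev/sdd1 /dev/sdf1
sudo fsck.ext4 -n /dev/md480                                # read-only check; never repair here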
Thanks Again
SA
---
Detailed Notes:
============================================================
2021.10.26 0200P NEW RAID MD480 (48TB RAW, 32TB USABLE) 3x 16TB HITACHI
======================================================================================================== PREPARATION ==
watch -c -d -n 1 cat /proc/mdstat ############## OPEN A TERMINAL AND MONITOR STATUS ##
sudo lsblk && sudo blkid ########################################### VERIFY DEVICES ##
sudo umount /MEGARAID # Unmount if filesystem is mounted
sudo mdadm --stop /dev/md480 # Stop the RAID/md480 device
sudo mdadm --zero-superblock /dev/sd[cdf]1 # Zero the superblock(s) on
# all members of the array
sudo mdadm --remove /dev/md480 # Remove the RAID/md480
Edit ######################################## OPTIONAL: FINALIZE PERMANENT REMOVAL ##
/etc/fstab
/etc/mdadm/mdadm.conf
Removing references to the mount point and to the definition of the RAID/md480 device(s).
NOTE: Some fstab settings (nofail) allow skipping a device that is unavailable at boot.
sudo update-initramfs -uv # -uv update ; verbose ########### RESET INITRAMFS ##
======================================================================================== CREATE RAID & ADD FILESYSTEM ==
MEGARAID 2021.10.26 0200P
############## RAID5 ARRAY MD480 32TB (32,001,527,644,160 bytes) Available (3x16TB) ##
sudo mdadm --create --verbose /dev/md480 --level=5 --raid-devices=3 \
     --uuid=2021102502005a7a5a7abeefcafebabe /dev/sd[cdf]1
31,251,491,840 BLOCKS CREATED IN ~20 HOURS
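# One thing I'd add to these notes, on the assumption it helps later: dump the per-member
# metadata right after creation, since the data offset and device order recorded there are
# exactly what a forced recreate would need to align the stripes.
sudo mdadm --examine /dev/sd[cdf]1 > ~/md480.examine.2021.10.26.txt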
############################################################ CREATE FILESYSTEM EXT4 ##
-v VERBOSE
-L VOLUME LABEL
-U UUID, FORMATTED AS 8-4-4-4-12 HEX CHARACTERS
-m RESERVED-BLOCKS PERCENTAGE (OVERFLOW PROTECTION); 0.025% OF ~32TB IS ~8GB RESERVED
-b BLOCK SIZE IN BYTES (4096 IS THE EXT4 DEFAULT)
-E STRIDE= CHUNK SIZE / BLOCK SIZE, IN FILESYSTEM BLOCKS (WORKED EXAMPLE BELOW)
   STRIPE-WIDTH= STRIDE X DATA DISKS (2 OF 3 FOR THIS RAID5)
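WORKED EXAMPLE (ASSUMING THE MDADM DEFAULT 512K CHUNK):
STRIDE = CHUNK / BLOCK = 524288 / 4096 = 128 ; STRIPE-WIDTH = 128 X 2 DATA DISKS = 256
THE stride=32 / stripe-width=64 USED BELOW WOULD INSTEAD CORRESPOND TO A 128K CHUNK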
sudo mkfs.ext4 -v -L MEGARAID -U 20211028-0500-5a7a-5a7a-beefcafebabe -m .025 -b 4096 \
     -E stride=32,stripe-width=64 /dev/md480
sudo mkdir /MEGARAID ; sudo chown adminx:adminx -R /MEGARAID
############################################################## SET CORRECT HOMEHOST ##
sudo umount /MEGARAID
sudo mdadm --stop /dev/md480
sudo mdadm --assemble --update=homehost --homehost=GRANDSLAM /dev/md480 /dev/sd[cdf]1
sudo blkid
/dev/sdc1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
UUID_SUB="8f0835db-3ea2-4540-2ab4-232d6203d1b7"
LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
PARTLABEL="HIT*16TB*001*RAID5"
PARTUUID="3b68fe63-35d0-404d-912e-dfe1127f109b"
/dev/sdd1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
UUID_SUB="b4660f49-867b-9f1e-ecad-0acec7119c37"
LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
PARTLABEL="HIT*16TB*002*RAID5"
PARTUUID="32c50f4f-f6ce-4309-b8e4-facdb6e05ba8"
/dev/sdf1: UUID="20211025-0200-5a7a-5a7a-beefcafebabe"
UUID_SUB="79a3dff4-c53f-9071-f9c1-c262403fbc10"
LABEL="GRANDSLAM:480" TYPE="linux_raid_member"
PARTLABEL="HIT*16TB*003*RAID5"
PARTUUID="7ec27f96-2275-4e09-9013-ac056f11ebfb"
/dev/md480: LABEL="MEGARAID" UUID="20211028-0500-5a7a-5a7a-beefcafebabe" TYPE="ext4"
############################################################### ENTRY FOR /ETC/FSTAB ##
/dev/md480 /MEGARAID ext4 nofail,noatime,nodiratime,errors=remount-ro 0 2
#################################################### ENTRY FOR /ETC/MDADM/MDADM.CONF ##
ARRAY /dev/md480 metadata=1.2 name=GRANDSLAM:480 UUID=20211025:02005a7a:5a7abeef:cafebabe
#######################################################################################
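# A sketch for regenerating the ARRAY line later instead of typing it by hand; this appends
# whatever arrays are currently assembled, so review the file before keeping the result:
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf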
sudo update-initramfs -uv # -uv update ; verbose
sudo mount -a
sudo chown adminx:adminx -R /MEGARAID
############################################################### END 2021.10.28 0545A ##
On Mon, 2023-05-22 at 15:51 +0800, Yu Kuai wrote:
> Hi,
>
> On 2023/05/22 14:56, raid wrote:
> > Hi,
> > Thanks for the guidance; the current state has at least changed somewhat.
> >
> > BTW, sorry about life getting in the way of tech. =) That's the reason for my delayed response.
> >
> > -sudo mdadm -I /dev/sdc1
> > mdadm: /dev/sdc1 attached to /dev/md480, not enough to start (1).
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> > Version : 1.2
> > Raid Level : raid0
> > Total Devices : 1
> > Persistence : Superblock is persistent
> >
> > State : inactive
> > Working Devices : 1
> >
> > Delta Devices : 1, (-1->0)
> > New Level : raid5
> > New Layout : left-symmetric
> > New Chunksize : 512K
> >
> > Name : GRANDSLAM:480
> > UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > Events : 78714
> >
> > Number Major Minor RaidDevice
> >
> > - 8 33 - /dev/sdc1
> > -sudo mdadm -I /dev/sdd1
> > mdadm: /dev/sdd1 attached to /dev/md480, not enough to start (2).
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> > Version : 1.2
> > Raid Level : raid0
> > Total Devices : 2
> > Persistence : Superblock is persistent
> >
> > State : inactive
> > Working Devices : 2
> >
> > Delta Devices : 1, (-1->0)
> > New Level : raid5
> > New Layout : left-symmetric
> > New Chunksize : 512K
> >
> > Name : GRANDSLAM:480
> > UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > Events : 78714
> >
> > Number Major Minor RaidDevice
> >
> > - 8 49 - /dev/sdd1
> > - 8 33 - /dev/sdc1
> > -sudo mdadm -I /dev/sde1
> > mdadm: /dev/sde1 attached to /dev/md480, not enough to start (2).
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> > Version : 1.2
> > Raid Level : raid0
> > Total Devices : 3
> > Persistence : Superblock is persistent
> >
> > State : inactive
> > Working Devices : 3
> >
> > Delta Devices : 1, (-1->0)
> > New Level : raid5
> > New Layout : left-symmetric
> > New Chunksize : 512K
> >
> > Name : GRANDSLAM:480
> > UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > Events : 78712
> >
> > Number Major Minor RaidDevice
> >
> > - 8 65 - /dev/sde1
> > - 8 49 - /dev/sdd1
> > - 8 33 - /dev/sdc1
> > -sudo mdadm -I /dev/sdf1
> > mdadm: /dev/sdf1 attached to /dev/md480, not enough to start (3).
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> > Version : 1.2
> > Raid Level : raid0
> > Total Devices : 4
> > Persistence : Superblock is persistent
> >
> > State : inactive
> > Working Devices : 4
> >
> > Delta Devices : 1, (-1->0)
> > New Level : raid5
> > New Layout : left-symmetric
> > New Chunksize : 512K
> >
> > Name : GRANDSLAM:480
> > UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > Events : 78714
> >
> > Number Major Minor RaidDevice
> >
> > - 8 81 - /dev/sdf1
> > - 8 65 - /dev/sde1
> > - 8 49 - /dev/sdd1
> > - 8 33 - /dev/sdc1
> > -sudo mdadm -R /dev/md480
> > mdadm: failed to start array /dev/md480: Input/output error
> > ---
> > NOTE: Of additional interest...
> > ---
> > -sudo mdadm -D /dev/md480
> > /dev/md480:
> > Version : 1.2
> > Creation Time : Tue Oct 26 14:06:53 2021
> > Raid Level : raid5
> > Used Dev Size : 18446744073709551615
> > Raid Devices : 5
> > Total Devices : 3
> > Persistence : Superblock is persistent
> >
> > Update Time : Thu May 4 14:39:03 2023
> > State : active, FAILED, Not Started
> > Active Devices : 3
> > Working Devices : 3
> > Failed Devices : 0
> > Spare Devices : 0
> >
> > Layout : left-symmetric
> > Chunk Size : 512K
> >
> > Consistency Policy : unknown
> >
> > Delta Devices : 1, (4->5)
> >
> > Name : GRANDSLAM:480
> > UUID : 20211025:02005a7a:5a7abeef:cafebabe
> > Events : 78714
> >
> > Number Major Minor RaidDevice State
> > - 0 0 0 removed
> > - 0 0 1 removed
> > - 0 0 2 removed
> > - 0 0 3 removed
> > - 0 0 4 removed
> >
> > - 8 81 3 sync /dev/sdf1
> > - 8 49 1 sync /dev/sdd1
> > - 8 33 0 sync /dev/sdc1
>
> So the reason that this array can't start is that /dev/sde1 is not
> recognized as RaidDevice 2, and there are two RaidDevices missing for
> a raid5.
>
> Sadly I have no idea how to work around this; the superblock metadata
> seems to be broken.
>
> Thanks,
> Kuai
> > ---
> > -watch -c -d -n 1 cat /proc/mdstat
> > ---
> > Every 1.0s: cat /proc/mdstat OAK2023: Mon May 22 01:48:24 2023
> >
> > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> > md480 : inactive sdf1[4] sdd1[1] sdc1[0]
> > 46877239294 blocks super 1.2
> >
> > unused devices: <none>
> > ---
> > Hopefully that is some progress towards an array start? It's definitely unexpected output to me.
> > I/O Error starting md480
> >
> > Thanks!
> > SA
> >
> > On Thu, 2023-05-18 at 11:15 +0800, Yu Kuai wrote:
> >
> > > I have no idea why the other disks show that device 2 is missing, or what
> > > device 4 is.
> > >
> > > Anyway, can you try the following?
> > >
> > > mdadm -I /dev/sdc1
> > > mdadm -D /dev/mdxxx
> > >
> > > mdadm -I /dev/sdd1
> > > mdadm -D /dev/mdxxx
> > >
> > > mdadm -I /dev/sde1
> > > mdadm -D /dev/mdxxx
> > >
> > > mdadm -I /dev/sdf1
> > > mdadm -D /dev/mdxxx
> > >
> > > If the above works well, you can try:
> > >
> > > mdadm -R /dev/mdxxx, and see if the array can be started.
> > >
> > > Thanks,
> > > Kuai
> >