From: Wols Lists <antlists@youngman.org.uk>
To: David T-G <davidtg-robot@justpickone.org>,
Linux RAID list <linux-raid@vger.kernel.org>
Subject: Re: why won't this RAID5 device start?
Date: Sun, 28 Mar 2021 20:27:38 +0100
Message-ID: <6060D8AA.9030504@youngman.org.uk>
In-Reply-To: <20210328021451.GB1415@justpickone.org>
On 28/03/21 03:14, David T-G wrote:
> Hi, all --
>
> I recently migrated our disk farm to a new box with a new OS build
> (openSuSE from KNOPPIX). Aside from the usual challenges of setting
> up the world again, I have a 3-device RAID5 volume that won't start.
> The other metadevice is fine, though; I think we can say that the md
> system is running. Soooooo ... Where do I start?
>
> diskfarm:~ # cat /proc/mdstat
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid5 sdc1[3] sdd1[4] sdb1[0]
> 11720265216 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [U_UU]
This looks wrong - is this supposed to be a four-drive array? U_UU
implies a missing drive ... as does [4/3] (four devices expected, only
three active).
>
> md127 : inactive sdl2[0](S) sdj2[3](S) sdf2[1](S)
> 2196934199 blocks super 1.2
>
> unused devices: <none>
>
> [No, I don't know why md127 was first a moment ago!] I tried reassembling
> the device, but mdadm doesn't like it.
>
> diskfarm:~ # mdadm --stop /dev/md127
> mdadm: stopped /dev/md127
> diskfarm:~ # mdadm --assemble --scan
> mdadm: /dev/md/750Graid5md assembled from 1 drive - not enough to start the array.
> mdadm: Found some drive for an array that is already active: /dev/md/0
> mdadm: giving up.
> mdadm: No arrays found in config file or automatically
>
> But ... what's with just 1 drive?
>
> diskfarm:~ # for D in /dev/sd[fjl] ; do parted $D print ; done
> Model: ATA WDC WD7500BPKX-7 (scsi)
> Disk /dev/sdf: 750GB
> Sector size (logical/physical): 512B/4096B
> Partition Table: gpt
> Disk Flags:
>
> Number Start End Size File system Name Flags
> 2 1049kB 750GB 750GB ntfs Linux RAID raid
> 3 750GB 750GB 134MB ext3 Linux filesystem
>
> Model: ATA WDC WD7500BPKX-7 (scsi)
> Disk /dev/sdj: 750GB
> Sector size (logical/physical): 512B/4096B
> Partition Table: gpt
> Disk Flags:
>
> Number Start End Size File system Name Flags
> 2 1049kB 750GB 750GB ntfs Linux RAID raid
> 3 750GB 750GB 134MB xfs Linux filesystem
>
> Model: ATA Hitachi HDE72101 (scsi)
> Disk /dev/sdl: 1000GB
> Sector size (logical/physical): 512B/512B
> Partition Table: msdos
> Disk Flags:
>
> Number Start End Size Type File system Flags
> 1 1049kB 4227MB 4226MB primary ntfs diag, type=27
> 2 4227MB 754GB 750GB primary reiserfs raid, type=fd
> 3 754GB 754GB 134MB primary reiserfs type=83
> 4 754GB 1000GB 246GB primary reiserfs type=83
>
> Slice 2 on each is the RAID partition, slice 3 on each is a little
> filesystem for bare-bones info, and slice 4 on sdl is a normal basic
> filesystem for scratch content.
>
> diskfarm:~ # mdadm --examine /dev/sd[fjl]2 | egrep '/dev|Name|Role|State|Checksum|Events|UUID'
> /dev/sdf2:
> Array UUID : 88575f01:592167fd:bd9f9ba1:a61fafc4
> Name : diskfarm:750Graid5md (local to host diskfarm)
> State : clean
> Device UUID : e916fc67:b8b7fc59:51440134:fa431d02
> Checksum : 43f9e7a4 - correct
> Events : 720
> Device Role : Active device 1
> Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
> /dev/sdj2:
> Array UUID : 88575f01:592167fd:bd9f9ba1:a61fafc4
> Name : diskfarm:750Graid5md (local to host diskfarm)
> State : clean
> Device UUID : 0b847f84:83e80a3d:a0dc11e7:60bffc9f
> Checksum : 9522782b - correct
> Events : 177792
> Device Role : Active device 2
> Array State : A.A ('A' == active, '.' == missing, 'R' == replacing)
> /dev/sdl2:
> Array UUID : 88575f01:592167fd:bd9f9ba1:a61fafc4
> Name : diskfarm:750Graid5md (local to host diskfarm)
> State : clean
> Device UUID : cc53440e:cb9180e4:be4c38d4:88a676eb
> Checksum : fef95256 - correct
> Events : 177794
> Device Role : Active device 0
> Array State : A.. ('A' == active, '.' == missing, 'R' == replacing)
>
> Slice f2 looks great, but slices j2 & l2 seem to be missing -- even though
> they are present. Worse, the Events counter on sdf2 is frighteningly
> small. Where did it go?!? So maybe I consider sdf2 failed and reassemble
> from the other two [only] and then put f2 back in?
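Yes, I'm afraid so. I'd guess you've been running a degraded raid-5 for
ages (sdf2 stopped at event 720, while the other two are up around
177790), and because something hiccuped when you shut it down, the event
counts on the two good drives drifted apart (177792 on sdj2 vs 177794 on
sdl2), and now the array won't start ...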
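
To make the event-count comparison above easier to read at a glance, here
is a minimal sketch of a helper that reduces `mdadm --examine` output to
one "device events" pair per line. The function name `events_by_device`
is made up for illustration; it only assumes the output layout shown in
the quoted listing (a `/dev/...:` header line followed by an `Events :`
line).

```shell
#!/bin/sh
# Hypothetical helper: read `mdadm --examine` output on stdin and print
# "<device> <event count>" for each member device.
events_by_device() {
    awk '
        # Header line such as "/dev/sdf2:" - remember the device name.
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # Line such as "Events : 720" - the count is the third field.
        $1 == "Events" { print dev, $3 }
    '
}

# Intended use (as root):
#   mdadm --examine /dev/sd[fjl]2 | events_by_device
```

A member whose count is far behind the others (sdf2 here) is stale; members
that differ by only a few events are the usual candidates for a forced
assembly.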
>
> Definitely time to stop, take a deep breath, and ask for help :-)
>
Before anything else, I'd do a quick smartctl health check on all three
drives, just to make sure nothing is obviously wrong with them - a
problem could kill your array completely!
Then read up about overlays on the web site, use an overlay to
force-assemble the two good drives (sdj2 and sdl2), and run fsck etc to
check everything's good (you might lose a bit of data, hopefully very
little). If everything looks okay, do it for real: force-assemble the
two good drives, then add the third drive back in (whether you use --add
or --re-add probably won't make much difference).
Oh - and fix whatever is wrong with md0, too, before that dies on you!
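
The overlay approach described above can be sketched roughly as follows.
This is an illustration under stated assumptions, not a tested recovery
script: the overlay directory `/tmp/overlays`, the 4G overlay size, the
`-overlay` suffix and the function names are all made up here, and the
device names are taken from the listing earlier in the thread. The idea is
that every write from the trial assembly lands in a sparse file via a
device-mapper snapshot, so the real partitions stay untouched.

```shell
#!/bin/sh
# Sketch only. Assumed names: /tmp/overlays, 4G sparse files, "-overlay" suffix.

# Print the device-mapper table for a snapshot of an origin device.
# $1 = origin size in 512-byte sectors, $2 = origin, $3 = copy-on-write device.
# "P 8" asks for a persistent snapshot with 8-sector chunks.
snapshot_table() {
    printf '0 %s snapshot %s %s P 8\n' "$1" "$2" "$3"
}

# Create /dev/mapper/<name>-overlay on top of one member partition (needs root).
make_overlay() {
    dev=$1
    name=$(basename "$dev")
    mkdir -p /tmp/overlays
    # Sparse file that absorbs all writes made during the trial assembly.
    truncate -s 4G "/tmp/overlays/$name.cow"
    loop=$(losetup -f --show "/tmp/overlays/$name.cow")
    size=$(blockdev --getsz "$dev")
    dmsetup create "$name-overlay" \
        --table "$(snapshot_table "$size" "$dev" "$loop")"
}

# Intended sequence, run by hand as root:
#   smartctl -H /dev/sdf; smartctl -H /dev/sdj; smartctl -H /dev/sdl
#   mdadm --stop /dev/md127
#   make_overlay /dev/sdj2
#   make_overlay /dev/sdl2
#   mdadm --assemble --force --run /dev/md127 \
#       /dev/mapper/sdj2-overlay /dev/mapper/sdl2-overlay
#   fsck -n /dev/md127    # or whichever read-only check suits the filesystem
```

If the trial looks good, the same `--assemble --force` would be repeated
against the real /dev/sdj2 and /dev/sdl2 after stopping the array and
tearing down the overlays, and only then would sdf2 be added back.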
Cheers,
Wol