From: Robin Hill <robin@robinhill.me.uk>
To: Fabio Bacigalupo <info1@open-haus.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: Seeking help to get a failed RAID5 system back to life
Date: Fri, 29 Aug 2014 08:46:27 +0100
Message-ID: <20140829074627.GA8321@cthulhu.home.robinhill.me.uk>
In-Reply-To: <CAGgRf0Qe1EU6wZm4BbWqs0diwwnBCeStiVAXaOdHV88Ttn=vfw@mail.gmail.com>
On Fri Aug 29, 2014 at 04:07:40AM +0200, Fabio Bacigalupo wrote:
> Hello,
>
> I have been trying all night to get my system back to work. One of the
> two remaining hard drives suddenly stopped working today. I read and
> tried everything I could find that seemed not to make things worse
> than they are. Finally I stumbled upon this page [1] on the Linux Raid
> wiki, which recommends consulting this mailing list.
>
> I had a RAID 5 installation with three disks but disk 0 (I assume as
> it was /dev/sda3) has been taken out for a while. The disks reside in
> a remote server.
>
That's a disaster waiting to happen. You should never leave a RAID array
in a degraded state for any longer than is absolutely necessary,
otherwise you might as well not bother running RAID at all.
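For reference, once a working replacement disk is available, restoring
redundancy is normally just a case of partitioning it to match and adding
the partition back into the array. A minimal sketch, assuming the array is
/dev/md127 and the replacement partition ends up as /dev/sda3 (hypothetical
names - adjust for your setup):

    # sfdisk -d /dev/sdb | sfdisk /dev/sda
    # mdadm --manage /dev/md127 --add /dev/sda3
    # cat /proc/mdstat

The first command copies the MBR partition table from the surviving disk,
the second adds the new partition so md rebuilds onto it, and the third
shows the resync progress.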
> Sorry if this is obvious to you but I am totally stuck. I always run
> into dead ends.
>
> Your help is very much appreciated!
>
> Thank you for any hints,
> Fabio
>
> I could gather the following information:
>
> ================================================================================
>
> # mdadm --examine /dev/sd*3
> mdadm: No md superblock detected on /dev/sda3.
> /dev/sdb3:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : f07f4bc6:36864b49:776c2c25:004bd7b2
> Creation Time : Wed May 4 08:18:11 2011
> Raid Level : raid5
> Used Dev Size : 1462766336 (1395.00 GiB 1497.87 GB)
> Array Size : 2925532672 (2790.01 GiB 2995.75 GB)
> Raid Devices : 3
> Total Devices : 1
> Preferred Minor : 127
>
> Update Time : Thu Aug 28 19:55:59 2014
> State : clean
> Active Devices : 1
> Working Devices : 1
> Failed Devices : 1
> Spare Devices : 0
> Checksum : 490fa722 - correct
> Events : 68856340
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 1 8 19 1 active sync /dev/sdb3
>
> 0 0 0 0 0 removed
> 1 1 8 19 1 active sync /dev/sdb3
> 2 2 0 0 2 faulty removed
> /dev/sdc3:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : f07f4bc6:36864b49:776c2c25:004bd7b2
> Creation Time : Wed May 4 08:18:11 2011
> Raid Level : raid5
> Used Dev Size : 1462766336 (1395.00 GiB 1497.87 GB)
> Array Size : 2925532672 (2790.01 GiB 2995.75 GB)
> Raid Devices : 3
> Total Devices : 2
> Preferred Minor : 127
>
> Update Time : Thu Aug 28 19:22:19 2014
> State : active
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
> Checksum : 44f4f557 - correct
> Events : 68856326
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 2 8 35 2 active sync /dev/sdc3
>
> 0 0 0 0 0 removed
> 1 1 8 19 1 active sync /dev/sdb3
> 2 2 8 35 2 active sync /dev/sdc3
>
>
> ================================================================================
>
> # mdadm --examine /dev/sd[b]
> /dev/sdb:
> MBR Magic : aa55
> Partition[0] : 4737024 sectors at 2048 (type 83)
> Partition[2] : 2925532890 sectors at 4739175 (type fd)
>
>
> ================================================================================
>
> Disk /dev/sdc has been replaced with a new hard drive as the old one
> had input/output errors.
>
Are the above --examine results from before or after the replacement?
Was the old /dev/sdc data replicated onto the replacement disk?
> I assume this is weird; it showed /dev/sdb3 before (before I changed things):
>
> # cat /proc/mdstat
> Personalities : [raid1]
> unused devices: <none>
>
> I tried to copy the partition structure from /dev/sdb to /dev/sdc, which presumably worked:
>
This shouldn't be needed if the old disk was replicated before being
replaced.
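(By "replicated" I mean a block-level copy of the old disk onto the new one
before the swap. A minimal sketch of how that is usually done, assuming -
purely for illustration - the failing disk was /dev/sdc and the replacement
was temporarily attached as /dev/sdd:

    # ddrescue -f /dev/sdc /dev/sdd /root/sdc-rescue.map

A copy like that carries the md superblock across, so --examine on the
replacement would then show the same metadata the old disk did.)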
> # sgdisk -R /dev/sdc /dev/sdb
>
> ***************************************************************
> Found invalid GPT and valid MBR; converting MBR to GPT format
> in memory.
> ***************************************************************
>
> The operation has completed successfully.
>
> # sgdisk -G /dev/sdc
>
> The operation has completed successfully.
>
> # fdisk -l
>
> -- Removed /dev/sda --
>
> Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes, 2930277168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk label type: dos
> Disk identifier: 0x0005fb16
>
> Device Boot Start End Blocks Id System
> /dev/sdb1 2048 4739071 2368512 83 Linux
> /dev/sdb3 * 4739175 2930272064 1462766445 fd Linux raid autodetect
> WARNING: fdisk GPT support is currently new, and therefore in an
> experimental phase. Use at your own discretion.
>
> Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes, 2930277168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk label type: gpt
>
> # Start End Size Type Name
> 1 2048 4739071 2.3G Linux filesyste Linux filesystem
> 3 4739175 2930272064 1.4T Linux RAID Linux RAID
>
>
> # mdadm --assemble /dev/md127 /dev/sd[bc]3
> mdadm: no RAID superblock on /dev/sdc3
> mdadm: /dev/sdc3 has no superblock - assembly aborted
>
> # mdadm --assemble /dev/md127 /dev/sd[b]3
> mdadm: /dev/md127 assembled from 1 drive - not enough to start the array.
>
> # mdadm --misc -QD /dev/sd[bc]3
> mdadm: /dev/sdb3 does not appear to be an md device
> mdadm: /dev/sdc3 does not appear to be an md device
>
> # mdadm --detail /dev/md127
> /dev/md127:
> Version :
> Raid Level : raid0
> Total Devices : 0
>
> State : inactive
>
> Number Major Minor RaidDevice
>
>
> [1] https://raid.wiki.kernel.org/index.php/RAID_Recovery
If the initial --examine results were run on the same disks as the
--assemble, then I'm rather confused as to why mdadm would find a
superblock on one and not on the other. Could you post the mdadm and
kernel versions? Possibly there's a bug that's been fixed in newer
releases.
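Something like the following should capture both:

    # mdadm --version
    # uname -r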
If the --examine was on the old disk and this wasn't replicated onto the
new one then I'm not sure what you're expecting to happen here - you've
lost 2 disks in a 3-disk RAID-5 so your data is now toast.
Cheers,
Robin
--
___
( ' } | Robin Hill <robin@robinhill.me.uk> |
/ / ) | Little Jim says .... |
// !! | "He fallen in de water !!" |