From: Neil Brown <neilb@suse.de>
To: Steve Evans <jeeping@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: mdadm degraded RAID5 failure
Date: Sat, 25 Oct 2008 17:30:12 +1100
Message-ID: <18690.48372.369694.309381@notabene.brown>
In-Reply-To: message from Steve Evans on Wednesday October 22
On Wednesday October 22, jeeping@gmail.com wrote:
> Hi all..
Hi.
You need to get a mail client that doesn't destroy the formatting of
the text that you paste in. But while it is an inconvenience, we
should be able to persevere...
>
> I had one of the disks in my 3 disk RAID5 die on me this week. When
> attempting to replace the disk via a hot swap (USB), the RAID didn't
> like it. It decided to mark one of my remaining 2 disks as faulty.
It would be interesting to see the kernel logs at this time. Maybe
the USB bus glitched while you were plugging the device in.
>
> Can someone *please* help me get the raid back!?
Probably.
>
> More details -
>
> Drives are /dev/sdb1, /dev/sdc1 & /dev/sdd1
... or were. USB device names can change every time you plug them in.
>
> sdc1 was the one that died earlier this week
> sdb1 appears to be the one that was marked as faulty
>
> mdadm detail before sdc1 was plugged in -
>
> root@imp[~]:11 # mdadm --detail /dev/md1
> /dev/md1:
...
>
> Number Major Minor RaidDevice State
> 0 8 17 0 active sync /dev/sdb1
> 1 0 0 - removed
> 2 8 49 2 active sync /dev/sdd1
So the array thinks the 2nd of 3 is missing. That is consistent with
your description.
>
>
> then after plugging in the replacement sdc1 -
>
> root@imp[~]:13 # mdadm --add /dev/md1 /dev/sdc1
> mdadm: hot added /dev/sdc1
> root@imp[~]:14 #
> root@imp[~]:14 #
> root@imp[~]:14 # mdadm --detail /dev/md1
> /dev/md1:
...
>
> Number Major Minor RaidDevice State
> 0 0 0 - removed
> 1 0 0 - removed
> 2 8 49 2 active sync /dev/sdd1
>
> 3 8 33 0 spare rebuilding /dev/sdc1
> 4 8 17 - faulty /dev/sdb1
Yes, sdb must have got an error and failed while sdc was rebuilding.
Sad. That suggests that it didn't fail at the moment of USB
insertion, but a little later. Not conclusively though.
>
> Shortly after this, subsequent mdadm --detail calls stopped responding. So
> I rebooted in the hope I could reset any problems with the hot add.
>
> Now, I'm unable to assemble the raid with the 2 working drives -
>
> mdadm --assemble /dev/md1 /dev/sdb1 /dev/sdd1
>
> doesn't work -
>
> mdadm: /dev/md1 assembled from 1 drive and 1 spare - not enough to
> start the array.
You have rebooted so device names may have changed.
If it thought you had named a good drive and a spare, it probably saw
the device that was originally sdb (and possibly still is)
and the device that was originally sdc (and now might be sdd).
>
> mdadm --assemble --force /dev/md1 /dev/sdb1 /dev/sdd1
>
> doesn't work either
What error messages? Always best to be explicit.
Adding "-v" to the --assemble line would help too.
>
> This -
>
> mdadm --assemble --force --run /dev/md1 /dev/sdb1 /dev/sdd1
>
> Did work partially -
>
Hmm.. That really shouldn't have worked. The kernel should have
rejected the array...
>
> Here's the output from mdadm -E on each of the 2 drives -
Uhm... There should be 3 drives?
The 'good' one, the 'new' one, and the one that seemed to fail
immediately after you plugged in the 'new' one.
>
> /dev/sdb1:
..
> Number Major Minor RaidDevice State
> this 3 8 33 3 spare /dev/sdc1
>
> 0 0 0 0 0 removed
> 1 1 0 0 1 faulty removed
> 2 2 8 49 2 active sync /dev/sdd1
> 3 3 8 33 3 spare /dev/sdc1
sdb looks like the new one.
> /dev/sdd1:
...
>
> Number Major Minor RaidDevice State
> this 2 8 49 2 active sync /dev/sdd1
>
> 0 0 0 0 0 removed
> 1 1 0 0 1 faulty removed
> 2 2 8 49 2 active sync /dev/sdd1
> 3 3 8 33 0 spare /dev/sdc1
sdd looks like the good one.
Where is the "one that seemed to fail" which was once called sdb ??
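To make that comparison less error-prone, here is a rough sketch of a filter that pulls out just the "this" line from `mdadm --examine` output, so you can see at a glance which slot and state each superblock records regardless of what the device is called today. The function name `this_line_summary` is made up for illustration, and it assumes the 0.90-style table layout shown in the output quoted above (without the "> " quoting prefix).

```shell
# Hypothetical helper, not part of mdadm: summarise the "this" line of
# `mdadm --examine` output read from stdin.
# Assumed column layout (as in the quoted output):
#   this  Number  Major  Minor  RaidDevice  State...  /dev/name
this_line_summary() {
    awk '
        $1 == "this" {
            state = ""
            # The state may be more than one word (e.g. "active sync");
            # it runs from field 6 up to, but not including, the last field.
            for (i = 6; i < NF; i++)
                state = state (i > 6 ? " " : "") $i
            print "slot " $5 ": " state " (recorded as " $NF ")"
        }
    '
}

# Possible usage, one device at a time:
#   mdadm --examine /dev/sdb1 | this_line_summary
```

The "recorded as" name is whatever the device was called when the superblock was last written, which is exactly why it can disagree with the current name.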
>
> Is all the data lost, or can I recover from this?
Try

   mdadm --examine --brief --verbose /dev/sd*

That will list anything that looks like an array.
e.g. (on my devel machine)

   # mdadm --examine --brief --verbose /dev/sd*
   ARRAY /dev/md0 level=raid5 num-devices=3 UUID=cfd6a841:c24600be:c4297cb4:f8ef633e
      devices=/dev/sdb,/dev/sdc,/dev/sdd
   ARRAY /dev/md0 level=raid5 num-devices=2 UUID=cb711aad:db89ffc8:faa4816a:59e602da
      devices=/dev/sda11,/dev/sda12

Take careful note of the "devices=" part. That lists sets of devices
(maybe only one set in your case) which are all part of an array.
So I have two arrays, one across /dev/sdb, /dev/sdc, /dev/sdd and
one across /dev/sda11 and /dev/sda12.

Then

   mdadm --assemble --force --verbose /dev/md1 /dev/sd....

where you list all the devices in the "devices=" section for the array
you want to try to start.
Report the output of that command and whether it was successful.
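If copying the device list by hand feels risky, the following is a minimal sketch of a filter that extracts the "devices=" list for one array UUID from the `--examine --brief --verbose` output. The name `devices_for_uuid` is hypothetical, and the sketch assumes the two-line ARRAY / devices= layout shown in my example above; check what it prints before passing it to --assemble.

```shell
# Hypothetical helper, not part of mdadm: read
# `mdadm --examine --brief --verbose` output on stdin and print the member
# devices of the array whose UUID is given as $1, separated by spaces.
devices_for_uuid() {
    awk -v uuid="$1" '
        # Each ARRAY line starts a new entry; remember whether it is ours.
        /^ARRAY/ { want = (index($0, "UUID=" uuid) > 0) }
        # Match only the devices= continuation line (anchored, so that
        # "num-devices=" on the ARRAY line is not picked up by mistake).
        want && /^[ \t]*devices=/ {
            sub(/^[ \t]*devices=/, "")
            gsub(/,/, " ")
            print
        }
    '
}

# Possible usage (substitute the UUID of the array you want to start):
#   mdadm --examine --brief --verbose /dev/sd* | devices_for_uuid <UUID>
```

The printed list is what goes after /dev/md1 on the --assemble --force --verbose line.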
NeilBrown