Linux RAID subsystem development
From: Pierre Martineau <pierre.martineau@inserm.fr>
To: linux-raid@vger.kernel.org
Subject: Re: RAID5 recovering
Date: Mon, 15 Apr 2013 17:58:40 +0200
Message-ID: <516C23B0.3000003@inserm.fr>
In-Reply-To: <20130415151939.GA8383@cthulhu.home.robinhill.me.uk>

Thanks a lot!
The array seems to start with only minor problems:

mdadm: forcing event count in /dev/sdd1(3) from 112333 upto 112358
mdadm: clearing FAULTY flag for device 1 in /dev/md0 for /dev/sdd1
mdadm: /dev/md0 has been started with 3 drives (out of 4).
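
The "3 drives (out of 4)" state reported above can also be confirmed from /proc/mdstat. The following is a minimal sketch, not part of the original exchange: the function name degraded_arrays and the sample text are assumptions made for illustration, and the sample only imitates the usual mdstat layout for an array like this one.

```shell
#!/bin/sh
# Sketch: list md arrays that run with fewer active members than configured.
# Reads mdstat-style text on stdin. In that layout the status line carries a
# "[total/active]" pair, e.g. "[4/3]" for a 4-disk array with 3 disks running.
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md0 : active raid5 ..."
        /^md[0-9]+ :/ { name = $1 }
        # Find the "[total/active]" pair and compare the two numbers
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0)
                print name, "degraded", n[2] "/" n[1]
        }
    '
}

# Hypothetical sample text (not taken from the thread):
printf '%s\n' \
    'md0 : active raid5 sdc1[1] sdd1[3] sde1[2]' \
    '      2930279808 blocks level 5, 64k chunk, algorithm 2 [4/3] [_UUU]' |
    degraded_arrays
```

On a live system the function would be fed with "degraded_arrays < /proc/mdstat". Once the missing member has been added back (for example with "mdadm /dev/md0 --add /dev/sdb1", assuming the disk itself is healthy) and the rebuild has finished, the array should no longer be listed.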

The file systems are corrupted, but not too seriously.
I will look into RAID6 in the future.

Thanks again,
Pierre

Pierre MARTINEAU

Institut de Recherche en Cancérologie de Montpellier
Inserm U896 – Université Montpellier 1 – CRLC Val d’Aurelle
Campus Val d’Aurelle
208 Rue des Apothicaires
F-34298 Montpellier Cedex 5, France

Tel: +33 (0)4 67 61 37 43
Fax: +33 (0)4 67 61 37 87
E-mail: pierre.martineau@inserm.fr
E-mail: pierre.martineau@montpellier.unicancer.fr
Website: http://www.ircm.fr

On 15/04/2013 17:19, Robin Hill wrote:
> On Mon Apr 15, 2013 at 03:47:39PM +0200, Pierre Martineau wrote:
>
>> Dear RAID experts,
>>
>> I have a RAID5 volume that recently crashed and I need your advice
>> before taking any irreversible action.
>>
>> Let me first summarize the past and current state.
>>
>> 1) I had a nicely running RAID5 volume with 3 x 1 TB disks (LVM on top
>> and several LVM volumes in ext3 and ext4), but the volume had become a
>> bit too small and I decided to add a new 1 TB disk.
>>
> Given the rebuild time for a 1 TB disk, I'd be wary of running RAID5 - if
> you have the space, adding another disk and going to RAID6 will be much
> safer.
>
>> 2) I added a new disk and did not do anything for a couple of days (RAID
>> still running with 3 disks)
>>
>> 3) One of the old disks failed and was ejected from the RAID.
>>
>> 4) The ejected disk was not even present as /dev/sdX. I thus tested the
>> connections and the disk came back.
>>
>> 5) I resynced the ejected disk and was back with my original 3-disk array.
>>
>> 6) I waited 2-3 days and everything was fine. I then added the new disk
>> and resynced.
>>
>> 7) I now had a running 4-disk RAID5 array; I created a new volume and
>> started copying data to it.
>>
>> 8) During the weekend, 2 disks were ejected from the array: the newly
>> installed one and the same one as before (step 3)
>>
>> 9) Again the 2 disks were not present as /dev/sdX. I thus checked the
>> connections again and the problem was a Molex connector. The two ejected
>> disks were on the same Molex, which explains why both were detected as
>> faulty.
>>
>> Now, my list of errors as a newbie.
>>
>> 4) I did not save all the information before proceeding (mdadm
>> --examine, /etc/mdadm/mdadm.conf, syslog, ...)
>>
>> 5) I tried to assemble the disks with
>> mdadm --assemble --scan
>> with no result
>>
>> 6) I then tried the following, and this is my big error, I think!
>> mdadm --assemble /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
>>
>> I forgot /dev/md0 after --assemble in this command.
>> Because of this the /dev/sdb1 superblock was removed, and now
>> mdadm --examine /dev/sdb1 returns "No md superblock detected on /dev/sdb1"
>>
>> I would now like to be more cautious. If some expert from the list
>> would be kind enough to tell me whether the method proposed below
>> is the right approach, I will be grateful for the rest of my life :-)
>>
>> 7) I read the RAID wiki and the list.
>>
>> 8) I saved
>> mdadm --examine /dev/sd[bcde]1
>> dmesg
>> syslog
>> /etc/mdadm/mdadm.conf
>> fdisk -lu /dev/sd[bcde]
>>
>> I put the content of these files at the end of this message (except dmesg
>> and syslog because they are very long).
>>
>> 9) /dev/sdd is the new disk. This is clear in the fdisk listing since it
>> is a 4K-sector disk.
>> The normal order of the RAID is thus (see mdadm --examine /dev/sd[de]1)
>> sdb1 sdc1 sde1 sdd1
>>
>> 10) Events are
>> /dev/sdb1: no md superblock (see 6)
>> /dev/sdc1: Events : 112358
>> /dev/sdd1: Events : 112333
>> /dev/sde1: Events : 112358
>>
>> It seems that sdd was the first disk removed.
>> Presumably sdb1 is in sync since it was running with sdc1 when sdd1
>> and sde1 were ejected from the array (see 8), but I can't be sure since I
>> stupidly erased its superblock!
>>
>> 11) I propose to re-create the array with the --assume-clean option,
>> then check everything using "fsck -n" and "mount -o ro"
>> the command would be:
>>
>> mdadm --create /dev/md0 -e 0.90 --assume-clean --level=5 --raid-devices=4 \
>> --chunk=64 --size=976759936 /dev/sdb1 /dev/sdc1 /dev/sde1 /dev/sdd1
>>
> <-- snip -->
>
> Have you tried to force assemble the array first? Recreating the array
> is a risky option, so should be avoided if possible. First try doing:
>    mdadm -Af /dev/md0 /dev/sd[cde]1
>
> If that works then you'll need to re-add (and rebuild) /dev/sdb1. If it
> doesn't work, try rerunning (after making sure the array is stopped) and
> adding "-vvv" for extra verbosity, then send through the output from
> that and anything relevant from dmesg.
>
> HTH,
>      Robin
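
The event-count comparison in step 10 of the quoted message is what decides whether a forced assemble is plausible: members whose counts are close to the highest one can usually be forced back in, which matches the "forcing event count ... from 112333 upto 112358" line in the output at the top of this message. The following is a minimal sketch of that comparison, not something posted in the thread. The function name event_lag is an assumption, and the parser only expects a "/dev/...:" device line and an "Events : N" line, as in the figures quoted above.

```shell
#!/bin/sh
# Sketch: report each member's event count and how far it lags behind the
# highest count seen. Reads "mdadm --examine"-style text on stdin.
event_lag() {
    awk '
        # A line starting with a device path names the member being described
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # The event counter is the last field of the "Events :" line
        /Events :/ {
            count[dev] = $NF + 0
            if (count[dev] > max) max = count[dev]
            order[++n] = dev
        }
        END {
            for (i = 1; i <= n; i++)
                print order[i], count[order[i]], max - count[order[i]]
        }
    '
}

# The figures from step 10 of the quoted message:
printf '%s\n' \
    '/dev/sdb1: no md superblock' \
    '/dev/sdc1: Events : 112358' \
    '/dev/sdd1: Events : 112333' \
    '/dev/sde1: Events : 112358' |
    event_lag
```

With these figures, sdc1 and sde1 show a lag of 0 and sdd1 a lag of 25 events; sdb1 is not listed because it has no superblock to read. On a live system the input would come from "mdadm --examine /dev/sd[bcde]1 | event_lag".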
