Re: RAID5 disk failure during rebuild of spare, any chance of recovery when one of the failed devices is suspected to be intact?

From: Nicolas Jungers <nicolas@jungers.net>
To: "Tor Arne Vestbø" <torarnv@gmail.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: RAID5 disk failure during rebuild of spare, any chance of recovery when one of the failed devices is suspected to be intact?
Date: Mon, 16 Aug 2010 18:37:56 +0200
Message-ID: <4C696964.7030205@jungers.net>
In-Reply-To: <AANLkTinaGGH6H2cj3tXR8m0yef9YE65R4cHrQGoxwJfY@mail.gmail.com>

On 08/16/2010 06:27 PM, Tor Arne Vestbø wrote:
> On Mon, Aug 16, 2010 at 10:43 AM, Tim Small<tim@seoss.co.uk>  wrote:
>> On 16/08/10 07:12, Nicolas Jungers wrote:
>>>
>>> On 08/16/2010 07:54 AM, Tor Arne Vestbø wrote:
>>>>
>>>> You mean sdc and sde plus either sdb or sdd, depending on which
>>>> one I think is more sane at this point?
>>>
>>> I'd try both.  Do a ddrescue of the failing one and try that (with copy of
>>> the others) and check what's coming out.
>>
>> As an alternative to using ddrescue, you could quickly prototype various
>> arrangements (without writing anything to the drives) using a device-mapper
>> copy-on-write mapping - I posted some details to the list a while back when
>> I was trying to use this to reconstruct a hw raid array...  Check the list
>> archives for details.
>
> Cool, here's what I tried:
>
> Created sparse files for each of the devices
>
>    dd if=/dev/zero of=sdb_cow bs=1 count=0 seek=2GB
>
> Mapped that to a loop device
>
>    losetup /dev/loop1 sdb_cow
>
> Then ran the following for each device:
>
>    cow_size=`blockdev --getsize /dev/sdb1`
>    chunk_size=64
>    echo "0 $cow_size snapshot /dev/sdb1 /dev/loop1 p $chunk_size" |
>        dmsetup create sdb1_cow
>
> After these were created I tried the following:
>
> # mdadm -v -C /dev/md0 -l5 -n4 /dev/mapper/sdb1_cow \
>       /dev/mapper/sdc1_cow missing /dev/mapper/sde1_cow
> mdadm: layout defaults to left-symmetric
> mdadm: chunk size defaults to 64K
> mdadm: /dev/mapper/sdb1_cow appears to be part of a raid array:
>      level=raid5 devices=4 ctime=Sun Mar  2 22:52:53 2008
> mdadm: /dev/mapper/sdc1_cow appears to be part of a raid array:
>      level=raid5 devices=4 ctime=Sun Mar  2 22:52:53 2008
> mdadm: /dev/mapper/sde1_cow appears to be part of a raid array:
>      level=raid5 devices=4 ctime=Sun Mar  2 22:52:53 2008
> mdadm: size set to 732571904K
> Continue creating array? Y
> mdadm: array /dev/md0 started.
>
> # mdadm --detail /dev/md0
> /dev/md0:
>          Version : 00.90
>    Creation Time : Mon Aug 16 18:20:06 2010
>       Raid Level : raid5
>       Array Size : 2197715712 (2095.91 GiB 2250.46 GB)
>    Used Dev Size : 732571904 (698.64 GiB 750.15 GB)
>     Raid Devices : 4
>    Total Devices : 3
> Preferred Minor : 0
>      Persistence : Superblock is persistent
>
>      Update Time : Mon Aug 16 18:20:06 2010
>            State : clean, degraded
>   Active Devices : 3
> Working Devices : 3
>   Failed Devices : 0
>    Spare Devices : 0
>
>           Layout : left-symmetric
>       Chunk Size : 64K
>
>             UUID : 916ceaa2:b877a3cc:3973abef:31f2d600 (local to host monstre)
>           Events : 0.1
>
>      Number   Major   Minor   RaidDevice State
>         0     251        9        0      active sync   /dev/block/251:9
>         1     251       10        1      active sync   /dev/block/251:10
>         2       0        0        2      removed
>         3     251       12        3      active sync   /dev/block/251:12
>
> And I can now mount /dev/mapper/raid-home !
>
> The question now is, what next? Should I start copying things off to a
> backup, or run fsck first or something else to try to repair errors?
> Or perhaps are the 2GB sparse files too small for anything like that?

For me: first, copy everything.  You have an unreliable disk in the 
middle of your data.
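
On the 2GB question: anything that writes to the array (fsck repairs, a read-write mount) lands in the COW files, and a snapshot whose COW store fills up becomes invalid. Below is a rough sketch for keeping an eye on that, assuming the status line of a snapshot target has the form "<start> <length> snapshot <allocated>/<total> <metadata>" with sizes in sectors. The function name cow_usage_percent is made up for illustration; the device names are the ones from the commands quoted above.

```shell
#!/bin/sh
# Print how full a dm snapshot's COW store is, as a whole percentage.
# Argument: one status line as printed by `dmsetup status <name>` for a
# snapshot target, assumed to look like:
#   <start> <length> snapshot <allocated>/<total> <metadata>
cow_usage_percent() {
    # Field 4 is "<allocated>/<total>" in sectors. If it has no slash
    # (for example "Invalid"), the snapshot is no longer usable.
    usage=$(printf '%s\n' "$1" | awk '{print $4}')
    case "$usage" in
        */*) ;;
        *) echo "invalid"; return 1 ;;
    esac
    allocated=${usage%/*}
    total=${usage#*/}
    # Integer division, rounds down.
    echo $(( allocated * 100 / total ))
}

# Hypothetical usage on the live system (needs root):
#   for d in sdb1 sdc1 sde1; do
#       echo "$d: $(cow_usage_percent "$(dmsetup status ${d}_cow)")"
#   done
```

If the percentage climbs during a copy from a read-only mount, something is writing to the array and it is worth stopping to find out what before the overlays run out.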

N.
