From: NeilBrown <neilb@suse.de>
To: Dietrich Heise <dh@dhde.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: RAID5 failed while in degraded mode, need help
Date: Mon, 9 Jul 2012 10:12:08 +1000	[thread overview]
Message-ID: <20120709101208.49528808@notabene.brown> (raw)
In-Reply-To: <CACMY3vC=iAbKCssHH8CQ=msqNtmE1zF1muoZTusfRd989HvD_w@mail.gmail.com>


On Sun, 8 Jul 2012 21:05:02 +0200 Dietrich Heise <dh@dhde.de> wrote:

> Hi,
> 
> I have the following problem:
> One of four drives had S.M.A.R.T. errors, so I removed it and
> replaced it with a new one.
> 
> While that drive was rebuilding, one of the three remaining devices
> had an I/O error (sdd1) (sdc1 was the replacement drive and was syncing).
> 
> Now the following has happened (two drives show up as spares :( )

It looks like you tried to --add /dev/sdd1 back in after it failed, and mdadm
let you.  Newer versions of mdadm will refuse, as that is not a good thing to
do, but it shouldn't stop you from getting your data back.

First thing to realise is that you could have data corruption.  There is at
least one block in the array which cannot be recovered, possibly more: i.e.
any block on sdd1 which is bad, and any block at the same offset on sdc1.
These blocks may not be in any file, which would be lucky, or they may contain
important filesystem metadata, which could mean you've lost lots of files.

If you hadn't tried to --add /dev/sdd1, you could just have force-assembled
the array back in degraded mode (without sdc1) and backed up any critical
data.  As sdd1 now thinks it is a spare, you need to re-create the array
instead:

 mdadm -S /dev/md1
 mdadm -C /dev/md1 -l5 -n4 -e 1.2 -c 512 /dev/sdf1 /dev/sde1 /dev/sdd1 missing
or
 mdadm -C /dev/md1 -l5 -n4 -e 1.2 -c 512 /dev/sdf1 /dev/sde1 missing /dev/sdd1

depending on whether sdd1 was the 3rd or 4th device in the array - I cannot
tell from the output here.
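
If you want to test one ordering before committing to it, you can create the
array with that order and do a read-only check of the result, e.g.:

 mdadm -C /dev/md1 -l5 -n4 -e 1.2 -c 512 /dev/sdf1 /dev/sde1 /dev/sdd1 missing
 fsck -n /dev/md1     # read-only check - adjust for your filesystem type

If fsck only sees garbage, stop the array with "mdadm -S /dev/md1" and
re-create it with the other order.  Creating with a 'missing' slot does not
trigger a resync, so only the superblocks are rewritten - the data area is
not touched either way.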

You should then be able to mount the array and back up your data.
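
I would mount read-only for this, e.g. (the mount point and destination are
just examples):

 mount -o ro /dev/md1 /mnt
 rsync -a /mnt/ /some/other/disk/backup/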

You will then want to use 'ddrescue' to copy sdd1 onto a device with no bad
blocks, and then assemble the array using that device instead of sdd1.
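
Something like the following should work, assuming /dev/sdg1 is a partition
on a fresh disk at least as big as sdd1:

 mdadm -S /dev/md1                               # stop the array first
 ddrescue -f /dev/sdd1 /dev/sdg1 /root/sdd1.map

The map file lets ddrescue resume and retry the bad sectors on later runs.
The copy includes the md superblock, so the new device can simply take
sdd1's place when you assemble.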

Finally, you can add the new spare (sdc1) to the array and it should rebuild
successfully - provided there are no bad blocks on sdf1 or sde1.
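
i.e. something like:

 mdadm /dev/md1 --add /dev/sdc1
 cat /proc/mdstat                  # check the recovery progress

and let the recovery run to completion before trusting the array again.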

I hope that makes sense.  Do ask if anything is unclear.

NeilBrown


> 
> p3 disks # mdadm -D /dev/md1
> /dev/md1:
>         Version : 1.2
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>   Used Dev Size : 1465126400 (1397.25 GiB 1500.29 GB)
>    Raid Devices : 4
>   Total Devices : 4
>     Persistence : Superblock is persistent
> 
>     Update Time : Sun Jul  8 20:37:12 2012
>           State : active, FAILED, Not Started
>  Active Devices : 2
> Working Devices : 4
>  Failed Devices : 0
>   Spare Devices : 2
> 
>          Layout : left-symmetric
>      Chunk Size : 512K
> 
>            Name : p3:0  (local to host p3)
>            UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>          Events : 121205
> 
>     Number   Major   Minor   RaidDevice State
>        0       8       81        0      active sync   /dev/sdf1
>        1       8       65        1      active sync   /dev/sde1
>        2       0        0        2      removed
>        3       0        0        3      removed
> 
>        4       8       49        -      spare   /dev/sdd1
>        5       8       33        -      spare   /dev/sdc1
> 
> here is more information:
> 
> p3 disks # mdadm -E /dev/sdc1
> /dev/sdc1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>            Name : p3:0  (local to host p3)
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>    Raid Devices : 4
> 
>  Avail Dev Size : 2930275057 (1397.26 GiB 1500.30 GB)
>      Array Size : 8790758400 (4191.76 GiB 4500.87 GB)
>   Used Dev Size : 2930252800 (1397.25 GiB 1500.29 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : active
>     Device UUID : caefb029:526187ef:2051f578:db2b82b7
> 
>     Update Time : Sun Jul  8 20:37:12 2012
>        Checksum : 18e2bfe1 - correct
>          Events : 121205
> 
>          Layout : left-symmetric
>      Chunk Size : 512K
> 
>    Device Role : spare
>    Array State : AA.. ('A' == active, '.' == missing)
> p3 disks # mdadm -E /dev/sdd1
> /dev/sdd1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>            Name : p3:0  (local to host p3)
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>    Raid Devices : 4
> 
>  Avail Dev Size : 2930269954 (1397.26 GiB 1500.30 GB)
>      Array Size : 8790758400 (4191.76 GiB 4500.87 GB)
>   Used Dev Size : 2930252800 (1397.25 GiB 1500.29 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : active
>     Device UUID : 4231e244:60e27ed4:eff405d0:2e615493
> 
>     Update Time : Sun Jul  8 20:37:12 2012
>        Checksum : 4bec6e25 - correct
>          Events : 0
> 
>          Layout : left-symmetric
>      Chunk Size : 512K
> 
>    Device Role : spare
>    Array State : AA.. ('A' == active, '.' == missing)
> p3 disks # mdadm -E /dev/sde1
> /dev/sde1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>            Name : p3:0  (local to host p3)
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>    Raid Devices : 4
> 
>  Avail Dev Size : 2930253889 (1397.25 GiB 1500.29 GB)
>      Array Size : 8790758400 (4191.76 GiB 4500.87 GB)
>   Used Dev Size : 2930252800 (1397.25 GiB 1500.29 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : active
>     Device UUID : 28b08f44:4cc24663:84d39337:94c35d67
> 
>     Update Time : Sun Jul  8 20:37:12 2012
>        Checksum : 15faa8a1 - correct
>          Events : 121205
> 
>          Layout : left-symmetric
>      Chunk Size : 512K
> 
>    Device Role : Active device 1
>    Array State : AA.. ('A' == active, '.' == missing)
> p3 disks # mdadm -E /dev/sdf1
> /dev/sdf1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : 6d4ebfd4:491bcb50:d98d5e67:f226f362
>            Name : p3:0  (local to host p3)
>   Creation Time : Mon Feb 28 19:57:56 2011
>      Raid Level : raid5
>    Raid Devices : 4
> 
>  Avail Dev Size : 2930269954 (1397.26 GiB 1500.30 GB)
>      Array Size : 8790758400 (4191.76 GiB 4500.87 GB)
>   Used Dev Size : 2930252800 (1397.25 GiB 1500.29 GB)
>     Data Offset : 2048 sectors
>    Super Offset : 8 sectors
>           State : active
>     Device UUID : 78d5600a:91927758:f78a1cea:3bfa3f5b
> 
>     Update Time : Sun Jul  8 20:37:12 2012
>        Checksum : 7767cb10 - correct
>          Events : 121205
> 
>          Layout : left-symmetric
>      Chunk Size : 512K
> 
>    Device Role : Active device 0
>    Array State : AA.. ('A' == active, '.' == missing)
> 
> Is there a way to repair the raid?
> 
> thanks!
> Dietrich


