* Help with Failed array
       [not found] <dd404eb00907110001n177f9fabpb394aac706d3e90@mail.gmail.com>
@ 2009-07-11  7:46 ` Thomas Kenyon
  2009-07-11 14:29   ` Sujit Karataparambil
  2009-07-12 11:47   ` David Greaves
  0 siblings, 2 replies; 4+ messages in thread
From: Thomas Kenyon @ 2009-07-11  7:46 UTC (permalink / raw)
  To: linux-raid

My server at home operates a 4-disk software RAID 5 array which is
normally mounted at /.

At the moment I am using mdadm v2.6.7.2 (I don't know which version
built the array).

One of the disks appeared to have developed a fault: an I/O error
would be produced, the array would rebuild to try and map around it,
and once it had finished there'd be another I/O error and the rebuild
would start again.

I marked the drive as faulty, removed it from the array, replaced it
with another drive, replicated the partition map, and added the new
drive to the array.

As expected, the drive started being rebuilt.

I'm not sure whether the rebuild had finished by this point, but
another disk produced an I/O error, which broke the array.

Now I'm trying to recover any data I can from it.

The partitions in the array are sd[abcd]1 (SATA controller).

sdb is the drive that originally failed and has been replaced; sdc is
the drive that took everything down.

When I first try to assemble the array, this message appears in the
kernel log:

md: kicking non-fresh sdc1 from array!

and this appears on the console:

mdadm: /dev/md0 assembled from 2 drives and 1 spare - not enough to
start the array.

even if I use --force.

/proc/mdstat then shows:

Personalities : [raid6] [raid5] [raid4]
md0 : inactive sdd1[2](S) sdc1[1](S) sdb1[5](S) sda1[4](S)
      1937888128 blocks super 1.0

unused devices: <none>

(All of the [S] markers are new, except the one on sdb1.)

If I try to fail and remove sdc1 and then reinsert it I get:
mdadm /dev/md0 -f /dev/sdc1
mdadm: cannot get array info for /dev/md0

From mdadm --examine I get:

/dev/sda1:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x0
     Array UUID : 62f0dee0:e3a81d0c:68df2ad8:1ecb435b
           Name : 'chef':1
  Creation Time : Mon Jul 21 15:58:36 2008
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 968944072 (462.03 GiB 496.10 GB)
     Array Size : 2906832000 (1386.09 GiB 1488.30 GB)
  Used Dev Size : 968944000 (462.03 GiB 496.10 GB)
   Super Offset : 968944328 sectors
          State : clean
    Device UUID : 0bf37dbc:000ee685:7bcf601b:ff125af1

    Update Time : Fri Jul 10 18:45:04 2009
       Checksum : ef734a25 - correct
         Events : 19518

         Layout : left-symmetric
     Chunk Size : 64K

     Array Slot : 4 (failed, failed, 2, failed, 3)
    Array State : __uU 3 failed

/dev/sdb1:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x0
     Array UUID : 62f0dee0:e3a81d0c:68df2ad8:1ecb435b
           Name : 'chef':1
  Creation Time : Mon Jul 21 15:58:36 2008
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 968944200 (462.03 GiB 496.10 GB)
     Array Size : 2906832000 (1386.09 GiB 1488.30 GB)
  Used Dev Size : 968944000 (462.03 GiB 496.10 GB)
   Super Offset : 968944328 sectors
          State : clean
    Device UUID : bb71f618:9d9be6e2:84bd3fdb:02543dfb

    Update Time : Fri Jul 10 18:45:04 2009
       Checksum : 72ae8ca8 - correct
         Events : 19518

         Layout : left-symmetric
     Chunk Size : 64K

     Array Slot : 5 (failed, failed, 2, failed, 3)
    Array State : __uu 3 failed

/dev/sdc1:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x0
     Array UUID : 62f0dee0:e3a81d0c:68df2ad8:1ecb435b
           Name : 'chef':1
  Creation Time : Mon Jul 21 15:58:36 2008
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 968944200 (462.03 GiB 496.10 GB)
     Array Size : 2906832000 (1386.09 GiB 1488.30 GB)
  Used Dev Size : 968944000 (462.03 GiB 496.10 GB)
   Super Offset : 968944328 sectors
          State : clean
    Device UUID : 0259688e:21902f87:a4da25cd:a5042e6f

    Update Time : Fri Jul 10 18:45:04 2009
       Checksum : f2402efa - correct
         Events : 19375

         Layout : left-symmetric
     Chunk Size : 64K

     Array Slot : 1 (failed, failed, 2, failed, 3)
    Array State : __uu 3 failed

/dev/sdd1:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x0
     Array UUID : 62f0dee0:e3a81d0c:68df2ad8:1ecb435b
           Name : 'chef':1
  Creation Time : Mon Jul 21 15:58:36 2008
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 968944072 (462.03 GiB 496.10 GB)
     Array Size : 2906832000 (1386.09 GiB 1488.30 GB)
  Used Dev Size : 968944000 (462.03 GiB 496.10 GB)
   Super Offset : 968944328 sectors
          State : clean
    Device UUID : 1f3b1479:3e01d8dd:d8ce565a:cb13e780

    Update Time : Fri Jul 10 18:45:04 2009
       Checksum : d27e859e - correct
         Events : 19518

         Layout : left-symmetric
     Chunk Size : 64K

     Array Slot : 2 (failed, failed, 2, failed, 3)
    Array State : __Uu 3 failed

As you can see, sdc1's event count is more than a little out of date;
oddly, sdb1 is in State: clean.

I've run out of ideas. Can someone please point me in the right
direction on how to recover some of the data from this?

TIA for any help received.
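For this failure pattern (one member kicked as non-fresh, the others
clean), the usual first step is to compare the Events counters across
members and, if exactly one member is only slightly stale, retry
assembly with --force so mdadm can accept it anyway. A minimal sketch,
assuming the device names above and leaving out the partially rebuilt
sdb1 — this is not a verified fix for this particular array, and is
best attempted on imaged disks if the data matters:

# Compare event counts; the member that was kicked will lag behind.
mdadm --examine /dev/sd[acd]1 | grep -E '/dev/|Events'

# Stop the half-assembled, inactive array first.
mdadm --stop /dev/md0

# Force assembly from the three original members; --force lets mdadm
# accept sdc1 despite its stale event count.
mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdc1 /dev/sdd1

Anything started this way should be mounted read-only and copied off
before any further repair is attempted.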
* Re: Help with Failed array
  2009-07-11  7:46 ` Help with Failed array Thomas Kenyon
@ 2009-07-11 14:29   ` Sujit Karataparambil
  2009-07-11 14:35     ` Twigathy
  2009-07-12 11:47   ` David Greaves
  1 sibling, 1 reply; 4+ messages in thread
From: Sujit Karataparambil @ 2009-07-11 14:29 UTC (permalink / raw)
  To: Thomas Kenyon; +Cc: linux-raid

I do not think that is how RAID 5 works. You should not have removed
the first faulty disk in the first place.

I think the way RAID 5 works is that it keeps the parity information
stored on one particular drive.

Correct me if this is wrong.

On Sat, Jul 11, 2009 at 1:16 PM, Thomas Kenyon<zhadnost@googlemail.com> wrote:
> My server at home operates a 4-disk software RAID 5 array which is
> normally mounted at /.
> [...]
--
-- Sujit K M
* Re: Help with Failed array
  2009-07-11 14:29   ` Sujit Karataparambil
@ 2009-07-11 14:35     ` Twigathy
  0 siblings, 0 replies; 4+ messages in thread
From: Twigathy @ 2009-07-11 14:35 UTC (permalink / raw)
  To: Sujit Karataparambil; +Cc: Thomas Kenyon, linux-raid

That's RAID 4. RAID 5 has distributed parity. :)

T

2009/7/11 Sujit Karataparambil <sjt.kar@gmail.com>:
> I do not think that is how RAID 5 works. You should not have removed
> the first faulty disk in the first place.
>
> I think the way RAID 5 works is that it keeps the parity information
> stored on one particular drive.
>
> Correct me if this is wrong.
> [...]
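The distinction is easy to see by computing where parity lands on each
stripe. A small illustrative sketch in plain sh (disk indices 0-3; the
left-symmetric formula matches the layout reported by --examine above,
but the script itself is just an illustration, not anything mdadm
provides):

#!/bin/sh
# For each stripe, print which disk holds parity on a 4-disk set:
# RAID 4 pins parity to one disk; RAID 5 left-symmetric rotates it.
n=4
printf 'stripe  raid4  raid5(left-symmetric)\n'
for s in 0 1 2 3 4 5 6 7; do
    p4=$((n - 1))                 # RAID 4: always the last disk
    p5=$(( (n - 1) - (s % n) ))   # RAID 5: rotates backwards per stripe
    printf '%6d  disk%d  disk%d\n' "$s" "$p4" "$p5"
done

Because parity rotates, losing any one disk in RAID 5 costs the array a
mix of data and parity blocks, all of which can be reconstructed from
the remaining members — which is why a degraded 3-of-4 assembly is
possible here at all.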
* Re: Help with Failed array
  2009-07-11  7:46 ` Help with Failed array Thomas Kenyon
  2009-07-11 14:29   ` Sujit Karataparambil
@ 2009-07-12 11:47   ` David Greaves
  1 sibling, 0 replies; 4+ messages in thread
From: David Greaves @ 2009-07-12 11:47 UTC (permalink / raw)
  To: Thomas Kenyon; +Cc: linux-raid

Thomas Kenyon wrote:
> My server at home operates a 4-disk software RAID 5 array which is
> normally mounted at /.
>
> At the moment I am using mdadm v2.6.7.2 (I don't know which version
> built the array).

Please also post your kernel version.

> One of the disks appeared to have developed a fault [...]
>
> I marked the drive as faulty, removed it from the array, replaced it
> with another drive, replicated the partition map, and added the new
> drive to the array.
>
> As expected, the drive started being rebuilt.

Good.

> I'm not sure whether the rebuild had finished by this point, but
> another disk produced an I/O error, which broke the array.

During the rebuild? That doesn't make sense, since the superblocks
below show 'clean'. Are you sure the rebuild didn't finish? If not,
then you may need expert support (probably from Neil on Monday). I
suggest digging out the logs to see.

> Now I'm trying to recover any data I can from it.

Normally, with the data you describe below, the /dev/sd[abd]1
partitions would happily create a degraded array. However, if you are
sure sdb is broken, then I'd wait.

> When I first try to assemble the array, this message appears in the
> kernel log:

Which command did you use?

David

--
"Don't worry, you'll be fine; I saw it work in a cartoon once..."
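David's degraded-array route, spelled out as a sketch under his own
caveat — only worth trying once the logs confirm sdb1's rebuild
completed. The device names come from this thread; the log file path
and the /mnt/recovery mount point are assumptions for illustration:

# Look for the rebuild-completion message in the kernel log; the log
# file path varies by distribution (assumed here).
grep -i 'md:.*recovery' /var/log/messages

# If sdb1 finished rebuilding, a degraded assembly from the three
# good members should start the array without sdc1; --run starts it
# even though it is missing a member.
mdadm --assemble --run /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdd1

# Mount read-only and copy the data off before attempting any repair.
mount -o ro /dev/md0 /mnt/recovery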
Thread overview: 4+ messages
[not found] <dd404eb00907110001n177f9fabpb394aac706d3e90@mail.gmail.com>
2009-07-11 7:46 ` Help with Failed array Thomas Kenyon
2009-07-11 14:29 ` Sujit Karataparambil
2009-07-11 14:35 ` Twigathy
2009-07-12 11:47 ` David Greaves