* raid 5, drives marked as failed. Can I recover?
@ 2009-01-29 23:17 Tom
2009-01-30 14:18 ` Justin Piszcz
0 siblings, 1 reply; 5+ messages in thread
From: Tom @ 2009-01-29 23:17 UTC (permalink / raw)
To: linux-raid
Hello,
2 drives have failed on my raid5 setup and I need to recover the data
on the raid.
I am sure that the drives still work, or at least one of them does.
How do I recover my drives?
I can't mount the raid anymore, and I am missing a hard drive when I
run ls /dev/sd?
I have 7 drives on my raid.
Here is the output of /var/log/messages at the following link:
http://matx.pastebin.com/m35423452
Also, some more information:
tom@desu ~ $ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
[raid4] [multipath]
md2 : inactive sdc1[1] sdd1[4] sdf1[3] sde1[2]
1953214208 blocks
tom@desu ~ $ sudo mdadm --detail /dev/md2
Password:
/dev/md2:
Version : 00.90.03
Creation Time : Thu Sep 4 20:14:31 2008
Raid Level : raid5
Used Dev Size : 488303552 (465.68 GiB 500.02 GB)
Raid Devices : 7
Total Devices : 4
Preferred Minor : 2
Persistence : Superblock is persistent
Update Time : Thu Jan 29 21:16:34 2009
State : active, degraded, Not Started
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : cf3d1948:1d0e65b6:c028c7c8:56f0c54c
Events : 0.1411738
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 33 1 active sync /dev/sdc1
2 8 65 2 active sync /dev/sde1
3 8 81 3 active sync /dev/sdf1
4 8 49 4 active sync /dev/sdd1
5 0 0 5 removed
6 0 0 6 removed
Thank you for your time in advance.
^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: raid 5, drives marked as failed. Can I recover?
2009-01-29 23:17 raid 5, drives marked as failed. Can I recover? Tom
@ 2009-01-30 14:18 ` Justin Piszcz
2009-01-30 14:56 ` David Greaves
0 siblings, 1 reply; 5+ messages in thread
From: Justin Piszcz @ 2009-01-30 14:18 UTC (permalink / raw)
To: Tom; +Cc: linux-raid
Try to assemble the array with --force.
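A sketch of what that would look like, assuming the array is /dev/md2 and the
components are the sd[a-g]1 partitions implied by your --detail output (adjust
to whatever device names actually exist on your system):

  mdadm --stop /dev/md2                           # md2 is currently inactive but still assembled
  mdadm --assemble --force /dev/md2 /dev/sd[a-g]1

--force tells md to accept members whose event counts are out of date, so only
do this once you're fairly sure the drives themselves are usable.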
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: raid 5, drives marked as failed. Can I recover?
2009-01-30 14:18 ` Justin Piszcz
@ 2009-01-30 14:56 ` David Greaves
2009-01-30 15:06 ` Tom
0 siblings, 1 reply; 5+ messages in thread
From: David Greaves @ 2009-01-30 14:56 UTC (permalink / raw)
To: Justin Piszcz; +Cc: Tom, linux-raid
Justin Piszcz wrote:
> Try to assemble the array with --force.
hmmmm? not yet...
> On Thu, 29 Jan 2009, Tom wrote:
>
>> Hello,
>>
>> 2 drives have failed on my raid5 setup and I need to recover the data
>> on the raid.
>> I am sure that the drives still work, or at least one of them does.
>>
>> How do I recover my drives?
How important is it?
The more important the data, the more you should reduce the risk of a subsequent
failure.
If you "don't care", then we just force it back together and cross our fingers.
Otherwise we run tests on all the drives before trying a restore.
I'd say to run these tests on each drive; as a minimum, do the first test on the
failed drive(s). The more paranoid you are, the more tests you run, and include
the non-failed drives too (to ensure they don't fail during recovery). Example
invocations are sketched after the list:
* smartctl -t short
* smartctl -t long
* badblocks
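For example (a sketch only, run as root, using /dev/sda as a stand-in for the
drive under test; substitute each of your drives in turn):

  smartctl -t short /dev/sda
  smartctl -t long /dev/sda      # can take hours on a 500GB drive
  smartctl -a /dev/sda           # read back the self-test results and error log
  badblocks -sv /dev/sda         # read-only scan, shows progress

Give the SMART self-tests time to finish before reading the results back with
smartctl -a.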
What happened? Smoke?
Are the drives faulty (what does smartctl -a tell you)?
Did the cables just wiggle? Is the controller broken?
You probably don't know :)
I would obtain replacements for the failed drives and use ddrescue to copy from
each failed drive to its replacement.
Then install the good drives and begin recovery.
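A minimal GNU ddrescue sketch (assuming the failing source is /dev/sda and a
blank replacement shows up as /dev/sdh; double-check the device names, since
swapping source and destination is fatal):

  ddrescue -f -n /dev/sda /dev/sdh sda-rescue.log    # fast first pass, skip slow retries
  ddrescue -f -r3 /dev/sda /dev/sdh sda-rescue.log   # go back and retry the bad areas

The log file lets you stop and resume, and lets the second pass go straight to
the sectors the first pass couldn't read.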
>> I can't mount the raid anymore, and I am missing a hard drive when I
>> run ls /dev/sd?
>> I have 7 drives on my raid.
You say you have 7 drives and 2 are failed.
And yet I see 4 drives, not 5.
Where is sdg?
>> Here is the output of /var/log/messages at the following link:
>>
>> http://matx.pastebin.com/m35423452
Jan 29 21:14:11 sda died
Jan 29 21:14:12 sdb died
>>
>> Also, some more information:
Also need:
Distro
Kernel version
Mdadm version
mdadm --examine for each available component.
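e.g. something like this, as root (a sketch, assuming the raid components are
the /dev/sd?1 partitions):

  for d in /dev/sd?1; do echo "== $d"; mdadm --examine "$d"; done

The Events counters and the device table in that output show which members md
considers stale.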
David
--
"Don't worry, you'll be fine; I saw it work in a cartoon once..."
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: raid 5, drives marked as failed. Can I recover?
2009-01-30 14:56 ` David Greaves
@ 2009-01-30 15:06 ` Tom
2009-01-30 15:24 ` David Greaves
0 siblings, 1 reply; 5+ messages in thread
From: Tom @ 2009-01-30 15:06 UTC (permalink / raw)
To: David Greaves, jpiszcz; +Cc: linux-raid
Hello,
I spent a night trying out mdadm --assemble on a virtual machine to
see how it attempts to fix a raid where 2 or more drives have been
marked faulty.
I was quite sure that the drives were fine and that they were wrongly
marked as bad.
I think I just have a bad ata controller.
I used --assemble on the real machine and it seems to have detected the raid again.
One drive was found to be bad and the array is rebuilding it now.
But my data is there and I can open it.
I am going to get some DVDs and back all this up before it dies again!
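In case it's useful to anyone hitting this later, a rough way to keep an eye on
the rebuild (assuming the array is /dev/md2, as above):

  watch -n 60 cat /proc/mdstat
  mdadm --detail /dev/md2 | grep -i rebuild    # shows "Rebuild Status : N% complete"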
Regards and thanks for your help!
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: raid 5, drives marked as failed. Can I recover?
2009-01-30 15:06 ` Tom
@ 2009-01-30 15:24 ` David Greaves
0 siblings, 0 replies; 5+ messages in thread
From: David Greaves @ 2009-01-30 15:24 UTC (permalink / raw)
To: Tom; +Cc: jpiszcz, linux-raid
Tom wrote:
> Hello,
>
> I spent a night trying out mdadm --assemble on a virtual machine to
> see how it attempts to fix a raid where 2 or more drives have been
> marked faulty.
> I was quite sure that the drives were fine and that they were wrongly
> marked as bad.
> I think I just have a bad ata controller.
Given that 2 drives died within 1 second, I'd agree.
> I used --assemble on the real machine and it seems to have detected the raid again.
> One drive was found to be bad and the array is rebuilding it now.
> But my data is there and I can open it.
> I am going to get some DVDs and back all this up before it dies again!
OK, that's good :)
A forced assemble will make md assume that all the disks are good and that all
writes succeeded, i.e. that all is well.
They probably didn't, and it probably isn't. OTOH, you probably lost a few hundred
bytes out of many, many GB, so it's nothing to panic over.
You should fsck and, ideally, checksum-compare your filesystem against a backup.
I would run a read-only fsck before doing anything else. Then, if you only have
light damage, repair it.
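Something like this, as a sketch (assuming the filesystem sits directly on
/dev/md2 and is ext2/3; make sure it is not mounted read-write when you check):

  fsck -n /dev/md2    # read-only: answers 'no' to every repair prompt, just reports

Only rerun it without -n (or with -p for the safe automatic fixes) once you've
seen how bad the damage actually is.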
David
^ permalink raw reply [flat|nested] 5+ messages in thread