Re: Two Drive Failure on RAID-5


From: "Janos Haar" <janos.haar@netcenter.hu>
To: David Greaves <david@dgreaves.com>, cry_regarder@yahoo.com
Cc: linux-raid@vger.kernel.org
Subject: Re: Two Drive Failure on RAID-5
Date: Tue, 20 May 2008 14:17:51 +0200
Message-ID: <033101c8ba73$87cbb9a0$9300a8c0@dcccs>
In-Reply-To: 4832966A.3010707@dgreaves.com


----- Original Message ----- 
From: "David Greaves" <david@dgreaves.com>
To: "Cry" <cry_regarder@yahoo.com>
Cc: <linux-raid@vger.kernel.org>
Sent: Tuesday, May 20, 2008 11:14 AM
Subject: Re: Two Drive Failure on RAID-5


> Cry wrote:
>> Folks,
>>
>> I had a drive fail on my 6-drive RAID-5 array. While syncing in the
>> replacement drive (11 percent complete), a second drive went bad.
>>
>> Any suggestions to recover as much data as possible from the array?
>
> Let us know if any step fails...
>
> How valuable is your data? If it is very valuable and you have no backups,
> then you may want to seek professional help.
>
> The replacement drive *may* help to rebuild up to 11% of your data in the
> event that the bad drive fails completely. You can keep it to one side to
> try this if you get really desperate.
>
> What follows assumes a real drive hardware failure (smartctl shows errors
> and dmesg showed media errors or similar).
>
> I would first suggest using ddrescue to duplicate the 2nd failed drive onto
> a spare drive. The replacement is fine if you want to risk the <11% of
> potentially saved data, but a new drive would be better; you're going to
> need a new one anyway!
>
> SOURCE is the 2nd failed drive
> TARGET is its replacement
>
> blockdev --getra /dev/SOURCE <note the readahead value>
> blockdev --setro /dev/SOURCE
> blockdev --setra  0 /dev/SOURCE
> ddrescue /dev/SOURCE /dev/TARGET /somewhere_safe/logfile
>
> Note: Janos Haar recently (18 May) posted a more conservative approach that
> you may want to use. Additionally, you may want to use a logfile.
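As a rough illustration of a cautious copy with a logfile, the sketch below prints a two-pass ddrescue plan: a fast first pass that skips the difficult areas, then a retry pass that uses the same logfile so only the missing areas are read again. This is only a sketch and is not claimed to be the approach from the 18 May post. The function name is made up, the device names and logfile path are placeholders, and the flags (-f, -n, -d, -r3) follow the GNU ddrescue manual, so check them against your installed version. Nothing is executed; the commands are printed for review.

```shell
#!/bin/sh
# Print (do not run) a two-pass ddrescue plan.
# ddrescue_plan is a hypothetical helper; SOURCE, TARGET and the logfile
# path are placeholders supplied by the caller.
ddrescue_plan() {
    src=$1; dst=$2; log=$3
    # Pass 1: grab the easy areas first. -n skips the slow scraping phase
    # (called "splitting" in older versions); -f is required because the
    # output is a block device rather than a regular file.
    echo "ddrescue -f -n $src $dst $log"
    # Pass 2: same logfile, so only areas still marked bad are retried.
    # -d reads the source with direct access; -r3 allows 3 retry passes.
    echo "ddrescue -f -d -r3 $src $dst $log"
}
```

Calling `ddrescue_plan /dev/SOURCE /dev/TARGET /somewhere_safe/logfile` prints the two command lines. Keeping the logfile on a different disk from both SOURCE and TARGET is what allows the second pass (or a restart after a crash) to resume instead of starting over.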
>
> ddrescue lets you know how much data it failed to recover. If this is a
> lot, then you may want to read up on the ddrescue info page (it includes a
> tutorial and lots of explanation) and consider drive data recovery tricks
> such as drive cooling (which some sources suggest may cause more damage
> than it solves, but which has worked for me in the past).
>
> I have also left ddrescue running overnight against a system that
> repeatedly timed out, and in the morning I've had a *lot* more recovered
> data.
>
> Having *successfully* done that, you can re-assemble the array using the 4
> good disks and the newly duplicated one.
>
> Unless you've rebooted, first restore the source drive's settings:
> blockdev --setrw /dev/SOURCE
> blockdev --setra  <saved readahead value> /dev/SOURCE
>
> mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
>
> cat /proc/mdstat will show the drive status
> mdadm --detail /dev/md0
> mdadm --examine /dev/sd[abcdef]1 [components]
>
> These should all show a reasonably healthy but degraded array.
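To make "healthy but degraded" a little more concrete, the sketch below counts missing members from the status brackets in /proc/mdstat-style text, where U marks a working member and _ a missing one. The helper name and the sample text in the usage note are made up for illustration; it relies on `grep -o`, which GNU and BSD grep provide. It sums over every array in the input.

```shell
#!/bin/sh
# Count missing members ("_") in the [UUUUU_]-style status field of
# /proc/mdstat-formatted text read from stdin.
# mdstat_missing is a hypothetical helper, not part of mdadm.
mdstat_missing() {
    # Keep only bracket groups made purely of U and _ (so sda1[0] and
    # [6/5] are ignored), drop everything except "_", count what is left.
    grep -o '\[[U_][U_]*\]' | tr -cd '_' | wc -c | tr -d ' '
}
```

On a live system this would be run as `mdstat_missing < /proc/mdstat`. For a RAID-5 array a count of 1 is the degraded-but-usable state described above; 0 means all members are present.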
>
> This should now be amenable to a read-only fsck/xfs_repair/whatever.

Maybe a COW (copy-on-write) loop device would help a lot here. ;-)
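One way to read the COW idea is as a device-mapper snapshot over each array member: reads come from the real partition, while writes land in a sparse overlay file, so a forced assemble or a repair run cannot modify the disks themselves. The sketch below only prints the commands. The helper name, overlay path, 4G size and mapping name are assumptions for illustration, and this may not be the exact setup intended.

```shell
#!/bin/sh
# Print (do not run) the commands for a copy-on-write overlay over one
# array member. cow_overlay_plan is a hypothetical helper; all names and
# the overlay size are illustrative.
cow_overlay_plan() {
    origin=$1; name=$2; overlay=$3
    # Sparse file that will absorb all writes (4G is an arbitrary guess).
    echo "truncate -s 4G $overlay"
    # Attach the file to the first free loop device and remember its name.
    echo "loop=\$(losetup -f --show $overlay)"
    # Origin size in 512-byte sectors, as the dm table expects.
    echo "size=\$(blockdev --getsz $origin)"
    # dm snapshot target: persistent (P) store, 8-sector chunks.
    echo "dmsetup create $name --table \"0 \$size snapshot $origin \$loop P 8\""
}
```

For example, `cow_overlay_plan /dev/sdb1 sdb1-cow /tmp/sdb1.ovl` prints four commands; after running them for each member, the array would be assembled from the /dev/mapper/ names instead of the raw partitions. Removing the mapping with `dmsetup remove` and detaching the loop device with `losetup -d` throws the experiment away.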

>
> If that looks reasonable, then you may want to do a proper fsck, perform a
> backup and add a new drive.
>
> HTH - let me know if any steps don't make sense; I think it's about time I
> put something on the wiki about data recovery...
>
> David
>
