Subject: recovering RAID5 from multiple disk failures
From: Michael Ritzert @ 2013-02-01 12:28 UTC (permalink / raw)
  To: linux-raid

Hi all,

this looks bad:
I have a RAID5 that showed a disk error. The disk failed badly, with read
errors apparently at locations important to the file system: the RAID read
speed dropped to a few kB/s, with permanent timeouts reading from the disk.
So I removed the disk from the RAID to be able to take a backup. The backup
ran well for one directory, then stopped completely. It turned out that
another disk had suddenly developed read errors as well.

So the situation is: I have a four-disk RAID5 with two active disks, and
two that dropped out at different times.

I made 1:1 copies of all 4 disks with ddrescue, and the error report shows
that the erroneous regions do not overlap. So I hope there is a chance to
recover the data.
Except for the filesystem mount itself, there were only read accesses to
the array after the first disk dropped out. So my strategy would be to
convince md to accept all disks as up to date, and to treat the read errors
on the two disks, as well as the differing filesystem metadata, as RAID
errors that can hopefully be corrected.
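
As far as I understand it, the most direct way to do that would be a forced
assembly of the copies, since --force rewrites the event counts of stale
members so they are accepted again. Something like this (the loop device
and image names are only placeholders for my setup):

losetup /dev/loop0 /path/to/disk0.img    # likewise for loop1..loop3
# work on the ddrescue copies only, never the original disks
mdadm --assemble --force /dev/md0 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3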

The mdadm --examine report for one of the disks looks like this:
/dev/sdb3:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : f5ad617a:14ccd4b1:3d7a38e4:71465fe8
  Creation Time : Fri Nov 26 19:58:40 2010
     Raid Level : raid5
  Used Dev Size : 1951945600 (1861.52 GiB 1998.79 GB)
     Array Size : 5855836800 (5584.56 GiB 5996.38 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0

    Update Time : Fri Jan  4 16:33:36 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 74966e68 - correct
         Events : 237

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       51        3      active sync

   0     0       0        0        0      removed
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       0        0        2      faulty removed
   3     3       8       51        3      active sync

My first attempt would be to try
mdadm --create --metadata=0.9 --chunk=64 --assume-clean, etc.
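
Spelled out, that would be something like the following, run against loop
devices over my ddrescue copies. The device order is still only my guess
and would need to be confirmed from the --examine output of every disk;
the other parameters are taken from the report above:

mdadm --create /dev/md0 --metadata=0.9 --level=5 --raid-devices=4 \
      --chunk=64 --layout=left-symmetric --assume-clean \
      /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3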

Is there a chance for this to succeed? Or do you have better suggestions?

If all recovery that involves assembling the array fails: Is it possible
to assemble the data manually?
I'm thinking along the lines of: take the first 64k from disk 1, then 64k
from disk 2, etc. This would probably take years to complete, but the data
is really important to me (which is why I put it on a RAID in the first
place...).
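
To make the idea concrete, a rough and untested sketch: the image names and
their RaidDevice order 0..3 are placeholders I would first confirm with
mdadm -E on every copy; chunk size and left-symmetric layout are as in the
report above.

#!/bin/bash
# Copy the data chunks of a 4-disk left-symmetric RAID5 out of the
# ddrescue images in array order. Parity chunks are simply skipped here,
# so the bad regions would still need the parity chunk to reconstruct.
CHUNK=65536                                       # 64K chunk size from the report
NDISKS=4
IMAGES=(disk0.img disk1.img disk2.img disk3.img)  # RaidDevice order 0..3 (placeholders)
# Used Dev Size is 1951945600 KiB, so each member holds this many 64K chunks
# (0.90 metadata sits at the end of the device, data starts at offset 0):
NSTRIPES=$(( 1951945600 / 64 ))

out=0
for (( stripe = 0; stripe < NSTRIPES; stripe++ )); do
    # left-symmetric: parity rotates from the last disk towards the first;
    # the data chunks of a stripe start on the disk after parity and wrap
    p=$(( NDISKS - 1 - stripe % NDISKS ))
    for (( i = 1; i < NDISKS; i++ )); do
        d=$(( (p + i) % NDISKS ))
        dd if="${IMAGES[d]}" bs=$CHUNK skip=$stripe count=1 \
           of=array.img seek=$out conv=notrunc 2>/dev/null
        out=$(( out + 1 ))
    done
done

With three dd calls per 64k stripe this really would take ages; it is only
meant to illustrate the chunk order, not to be efficient.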

Thanks,
Michael


