public inbox for linux-raid@vger.kernel.org
From: Leslie Rhorer <lesrhorer@att.net>
To: Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: Recover from crash in RAID6 due to hardware failure
Date: Mon, 14 Jun 2021 20:33:20 -0500	[thread overview]
Message-ID: <1c28967e-3dbc-e5ff-9536-b8de0cf9cd65@att.net> (raw)
In-Reply-To: <ed21aa89-e6a1-651d-cc23-9f4c72cf63e0@gmail.com>

	There is a fair chance you can recover the data by recreating the array:

mdadm -S /dev/md2
mdadm -C -f -e 1.2 -n 5 -c 64K --level=6 -p left-symmetric /dev/md2 \
    /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3
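	Note that your /proc/mdstat below reports five members ([5/1]), hence -n 5.  A small sketch that builds the exact command from variables and only prints it, so the geometry (member count, chunk, layout) can be eyeballed against the /proc/mdstat output before anything is written:

```shell
# Build the re-create command from variables and print it for review;
# run the printed command by hand only after double-checking every value.
NDEV=5; CHUNK=64K; LEVEL=6; LAYOUT=left-symmetric
DEVS="/dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3"
echo "mdadm -C -f -e 1.2 -n $NDEV -c $CHUNK --level=$LEVEL -p $LAYOUT /dev/md2 $DEVS"
```

	After re-creating, check the result read-only first (e.g. fsck -n, or mount -o ro) before trusting the array with any writes.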

On 6/8/2021 6:39 AM, Carlos Maziero wrote:
> Em 07/06/2021 07:27, Leslie Rhorer escreveu:
>> On 6/6/2021 10:07 PM, Carlos Maziero wrote:
>>
>>> However, the disks were added as spares and the volume remained
>>> crashed. Now I'm afraid that such commands have erased metadata and made
>>> things worse... :-(
>>
>>      Yeah.  Did you at any time Examine the drives and save the output?
>>
>> mdadm -E /dev/sd[a-e]3
>>
>>      If so, you have a little bit better chance.
> 
> Yes, but I did it only after the failure. The output for all disks is
> attached to this message.
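	For the record, those snapshots are easiest to keep one per drive; a sketch (the /tmp paths are just a suggestion, and the echo prints each command so it can be reviewed before being run for real):

```shell
# Print one "mdadm -E" snapshot command per member partition;
# drop the echo once the paths look right.
for d in a b c d e; do
    echo "mdadm -E /dev/sd${d}3 > /tmp/examine-sd${d}3"
done
```

	Comparing the saved files (especially the Events counts and device roles) is what tells you whether the drives still agree on the array geometry.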
> 
> 
>>> Is there a way to reconstruct the array and to recover its data, at
>>> least partially?
>>
>>      Maybe.  Do you know exactly which physical disk was in which RAID
>> position?  It seems likely the grouping was the same for the corrupted
>> array as for the other arrays, given the drives are partitioned.
> 
> Yes, disk sda was in slot 1, and so on. I physically labelled all slots
> and disks.
> 
> 
>>
>>      First off, try:
>>
>> mdadm -E /dev/sde3 > /etc/mdadm/RAIDfix
>>
>>      This should give you the details of the RAID array.  From this,
>> you should be able to re-create the array.  I would heartily recommend
>> getting some new drives and copying the data to them before
>> proceeding.  I would get a 12T drive and copy all of the partitions to
>> it:
>>
>> mkfs /dev/sdf  (or mkfs /dev/sdf1)
>> mount /dev/sdf /mnt (or mount /dev/sdf1 /mnt)
>> ddrescue /dev/sda3 /mnt/drivea /tmp/tmpdrivea
>> ddrescue /dev/sdb3 /mnt/driveb /tmp/tmpdriveb
>> ddrescue /dev/sdc3 /mnt/drivec /tmp/tmpdrivec
>> ddrescue /dev/sdd3 /mnt/drived /tmp/tmpdrived
>> ddrescue /dev/sde3 /mnt/drivee /tmp/tmpdrivee
>>
>>      You could skimp by getting an 8T drive, and then if drive e
>> doesn't fit, you could create the array without it, and you will be
>> pretty safe.  It's not what I would do, but if you are strapped for
>> cash...
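	Incidentally, since the five ddrescue copies above all follow one pattern, a loop avoids typos (a sketch; it only echoes the commands, so you can review and then run the printed lines by hand):

```shell
# Print one ddrescue command per member partition; drop the echo once
# the source/destination paths have been double-checked.
for d in a b c d e; do
    echo "ddrescue /dev/sd${d}3 /mnt/drive${d} /tmp/tmpdrive${d}"
done
```

	The third argument to ddrescue is its mapfile, which lets an interrupted copy resume where it left off, so keep those files until the copies are complete.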
> 
> OK, I will try to have a secondary disk for that and another computer,
> since the NAS has only 5 bays and I would need one more for doing such
> operations.
> 
> 
>>> Contents of /proc/mdstat (after the commands above):
>>>
>>> Personalities : [raid1] [linear] [raid0] [raid10] [raid6] [raid5] [raid4]
>>> md2 : active raid6 sda3[0](S) sdb3[1](S) sdc3[2](S) sdd3[3](S) sde3[4]
>>>       8776632768 blocks super 1.2 level 6, 64k chunk, algorithm 2 [5/1] [____U]
>>> md1 : active raid1 sda2[1] sdb2[2] sdc2[3] sdd2[0] sde2[4]
>>>       2097088 blocks [5/5] [UUUUU]
>>> md0 : active raid1 sda1[1] sdb1[2] sdc1[3] sdd1[0] sde1[4]
>>>       2490176 blocks [5/5] [UUUUU]
>>
>>      There is something odd here.  You say the disks failed, but
>> clearly they are in decent shape.  The first and second partitions on
>> all drives appear to be good.  Did the system recover the RAID1 arrays?
> 
> Apparently the failure was not in the disks, but in the NAS hardware. I
> opened it one week ago to upgrade the RAM (replaced the old 512MB module
> with a 1GB one), and maybe the slot connecting the main board to the
> SATA board developed a connectivity problem (though the NAS OS reported
> nothing about it). Anyway, I had 5 disks in a RAID 6 array and the logs
> showed 3 disks failing at the same time, which is quite unusual. This is
> why I believe the disks are physically OK.
> 
> Thanks for your attention!
> 
> Carlos
> 
> 

Thread overview: 5+ messages
2021-06-07  3:07 Recover from crash in RAID6 due to hardware failure Carlos Maziero
     [not found] ` <4745ddd9-291b-00c7-8678-cac14905c188@att.net>
     [not found]   ` <ed21aa89-e6a1-651d-cc23-9f4c72cf63e0@gmail.com>
2021-06-15  1:33     ` Leslie Rhorer [this message]
2021-06-15  1:36     ` Leslie Rhorer
2021-06-15  7:46       ` Roman Mamedov
2021-06-15 11:28       ` Carlos Maziero
