From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
To: Adam Niescierowicz <adam.niescierowicz@justnet.pl>
Cc: linux-raid@vger.kernel.org
Subject: Re: RAID6 12 device assemble force failure
Date: Mon, 1 Jul 2024 10:51:53 +0200
Message-ID: <20240701105153.000066f3@linux.intel.com>
In-Reply-To: <56a413f1-6c94-4daf-87bc-dc85b9b87c7a@justnet.pl>

Hello Adam,
I hope you have a backup! To quote the raid wiki linked below:

"Remember, RAID is not a backup! If you lose redundancy, you need to take a
backup!"

I'm not a native raid expert, but I will try to give you some clues.

On Sat, 29 Jun 2024 17:17:54 +0200
Adam Niescierowicz <adam.niescierowicz@justnet.pl> wrote:

> Hi,
> 
> I have a RAID 6 array on 12 disks attached via an external SAS backplane
> connected to the server by 4 LUNs. After some problems with the backplane,
> 3 disks went offline (in one second) and the array stopped.
> 

So the array is considered failed by mdadm, and that failed state is persisted
in the metadata of each member device.
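
If you want to check how consistent the members still are before touching
anything, comparing the event counters and roles recorded in the metadata of
every member is a good start. Just a sketch (adjust the device list to your
actual members):

#mdadm --examine /dev/sd[m-z]1 | grep -E '/dev/|Events|Device Role|Array State|Update Time'

Members with matching (or very close) event counts and update times are the
best candidates to bring back.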

>     Device Role : spare
>     Array State : AAAAA.AA.A.A ('A' == active, '.' == missing, 'R' == 
> replacing)

3 missing devices = failed raid 6 array (RAID 6 can only survive the loss of 2).

> I think the problem is that the disks are recognised as spare, but why?

Because mdadm cannot trust them: the metadata records them as "missing", so
they are no longer configured as raid devices (spare is the default state).
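
You should be able to see the same thing in /proc/mdstat if the array was left
partially assembled: it shows up as inactive and every member is listed with
the (S) spare marker:

#cat /proc/mdstat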

> I tried with `mdadm --assemble --force --update=force-no-bbl 

That removes the bad block list, but it does not revert the devices from
"missing" back to "active".

> /dev/sd{q,p,o,n,m,z,y,z,w,t,s,r}1` and now mdadm -E shows
> 
> 
> ---
> 
>            Magic : a92b4efc
>          Version : 1.2
>      Feature Map : 0x1
>       Array UUID : f8fb0d5d:5cacae2e:12bf1656:18264fb5
>             Name : backup:card1port1chassis2
>    Creation Time : Tue Jun 18 20:07:19 2024
>       Raid Level : raid6
>     Raid Devices : 12
> 
>   Avail Dev Size : 39063382016 sectors (18.19 TiB 20.00 TB)
>       Array Size : 195316910080 KiB (181.90 TiB 200.00 TB)
>      Data Offset : 264192 sectors
>     Super Offset : 8 sectors
>     Unused Space : before=264104 sectors, after=0 sectors
>            State : clean
>      Device UUID : e726c6bc:11415fcc:49e8e0a5:041b69e4
> 
> Internal Bitmap : 8 sectors from superblock
>      Update Time : Fri Jun 28 22:21:57 2024
>         Checksum : 9ad1554c - correct
>           Events : 48640
> 
>           Layout : left-symmetric
>       Chunk Size : 512K
> 
>     Device Role : spare
>     Array State : AAAAA.AA.A.A ('A' == active, '.' == missing, 'R' == 
> replacing)
> ---
> 
> 
> What can I do to start this array?

You may try to add them back manually. I know there is a --re-add
functionality, but I've never used it. Maybe something like this would work:
#mdadm --remove /dev/md126 <failed_drive>
#mdadm --re-add /dev/md126 <failed_drive>

If you manage to recover one drive this way, the array should start, but the
data might not be consistent - please be aware of that!
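
If md126 is assembled but stays inactive after that, it may additionally need
an explicit run (again, only a guess from my side):

#mdadm --run /dev/md126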

After that, the drive should be reported as in-sync again in the array details.
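
You can verify that in the array details (md126 taken from your commands,
adjust if needed):

#mdadm --detail /dev/md126

The re-added drive should be listed as "active sync" at the bottom instead of
"removed".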

I highly advise you to simulate this scenario first on a setup that is not
infrastructure-critical. As I said, I'm not a native raid expert.
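
A cheap way to do that is a throw-away array on loop devices. Just a sketch,
assuming /dev/loop0..11 and /dev/md99 are unused on the test machine:

#for i in $(seq 0 11); do truncate -s 200M /tmp/d$i.img; losetup /dev/loop$i /tmp/d$i.img; done
#mdadm --create /dev/md99 --level=6 --raid-devices=12 /dev/loop{0..11}
#mdadm /dev/md99 --fail /dev/loop9
#mdadm /dev/md99 --fail /dev/loop10
#mdadm /dev/md99 --fail /dev/loop11

After failing the third member you are in a similar failed state and can
practice the --assemble/--remove/--re-add steps there before touching the real
array.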

For more suggestions see:
https://raid.wiki.kernel.org/index.php/Replacing_a_failed_drive

Thanks,
Mariusz

