linux-raid.vger.kernel.org archive mirror
From: Phil Turmel <philip@turmel.org>
To: Andreas Klauer <Andreas.Klauer@metamorpher.de>,
	Patrick Tschackert <Killing-Time@gmx.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: problems with dm-raid 6
Date: Mon, 21 Mar 2016 08:42:16 -0400
Message-ID: <56EFEC28.7020207@turmel.org>
In-Reply-To: <20160320223727.GA29895@EIS>

Hi Patrick,

On 03/20/2016 06:37 PM, Andreas Klauer wrote:
> On Sun, Mar 20, 2016 at 10:44:57PM +0100, Patrick Tschackert wrote:
>> After rebooting the system, one of the harddisks was missing from my md raid 6 (the drive was /dev/sdf), so i rebuilt it with a hotspare that was already present in the system.
>> I physically removed the "missing" /dev/sdf drive after the restore and replaced it with a new drive.

Your smartctl output shows pending sector problems with sdf, sdh, and
sdj.  The latter two are WD Reds, which would not still show pending
sectors after a scrub, so I guess the smartctl report was from before that?

> Exact commands involved for those steps?
> 
> mdadm --examine output for your disks?

Yes, we want these.

>> $ cat /sys/block/md0/md/mismatch_cnt
>> 311936608
> 
> Basically the whole array out of whack.

Wow.

> This is what you get when you use --create --assume-clean on disks 
> that are not actually clean... or if you somehow convince md to 
> integrate a disk that does not have valid data on, for example 
> because you copied partition table and md metadata - but not  
> everything else - using dd.
> 
> Something really bad happened here and the only person who 
> can explain it, is probably yourself.

This is wrong.  Your mdadm -D output clearly shows a 2014 creation date,
so you definitely hadn't done --create --assume-clean at that point.
(Don't.)

> Your best bet is that the data is valid on n-2 disks.
> 
> Use overlay https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file
> 
> Assemble the overlay RAID with any 2 disks missing (try all combinations) and see if you get valid data.

No.  Something else is wrong, quite possibly hardware.  You don't get a
mismatch count like that without it showing up in smartctl too, unless
corrupt data was being written to one or more disks for a long time.
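
For a sense of scale, here is a rough sketch of the arithmetic.  It
assumes mismatch_cnt is reported in 512-byte sectors, as the kernel's md
documentation describes; the function names are made up for illustration.

```shell
# Convert an md mismatch_cnt value to bytes.
# Assumption: the count is in 512-byte sectors.
mismatch_to_bytes() {
    echo $(( $1 * 512 ))
}

# Same value in whole GiB (2097152 sectors of 512 bytes per GiB),
# rounded down by integer division.
mismatch_to_gib() {
    echo $(( $1 / 2097152 ))
}

# Example with the value quoted above:
mismatch_to_bytes 311936608   # prints 159711543296
mismatch_to_gib 311936608     # prints 148
```

On a live system the argument would come from
"$(cat /sys/block/md0/md/mismatch_cnt)".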

It's unclear from your dmesg what might have happened.  Probably bad
stuff going back years.

If you used ddrescue to replace sdf instead of letting mdadm reconstruct
it, that would have introduced zeroed sectors that would scramble your
encrypted filesystem.  Please let us know that you didn't use ddrescue.

The encryption inside your array will frustrate any attempt to do
per-member analysis.  I don't think there's anything still wrong with
the array (anything fixable, that is).

If an array error stomped on the key area of your dm-crypt layer, you
are totally destroyed, unless you happen to have a key backup you can
restore.

Otherwise you are at the mercy of fsck to try to fix your volume.  I
would use an overlay for that.
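
A minimal dry-run sketch of such an overlay, along the lines of the wiki
page linked above.  It only prints the commands so they can be reviewed
first.  The function name, the 50G overlay size, and every argument
(device, sector count, file path, loop device, mapping name) are
illustrative assumptions, not values taken from this system.

```shell
# Print (do not run) the commands that would place a copy-on-write
# snapshot over a block device, so that fsck writes land in a sparse
# file instead of on the real device.
# Arguments (all hypothetical):
#   $1 device to protect
#   $2 its size in 512-byte sectors (e.g. from: blockdev --getsz DEVICE)
#   $3 sparse overlay file
#   $4 free loop device
#   $5 name for the device-mapper mapping
print_overlay_cmds() {
    dev=$1; sectors=$2; cow_file=$3; loop=$4; name=$5
    # Sparse file to hold changed blocks; 50G is an arbitrary assumption.
    echo "truncate -s 50G $cow_file"
    # Attach the sparse file to the loop device.
    echo "losetup $loop $cow_file"
    # Snapshot target: origin, COW device, persistent, 8-sector chunks.
    echo "dmsetup create $name --table '0 $sectors snapshot $dev $loop P 8'"
}

print_overlay_cmds /dev/md0 1000 /tmp/overlay-md0 /dev/loop0 md0-overlay
```

After checking the printed commands and running them as root, fsck would
be pointed at the volume reached through /dev/mapper/md0-overlay (after
opening the dm-crypt layer on top of it) rather than through /dev/md0.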

Phil
