Re: Help with data recovery - RAID6 with 2 failed drives and another with broken sectors

From: Phil Turmel <philip@turmel.org>
To: "Michał Sawicz" <michal@sawicz.net>
Cc: linux-raid@vger.kernel.org
Subject: Re: Help with data recovery - RAID6 with 2 failed drives and another with broken sectors
Date: Sun, 06 Oct 2013 17:44:15 -0400
Message-ID: <5251D9AF.9030402@turmel.org>
In-Reply-To: <524B2158.2020900@sawicz.net>

Hi Michał,

On 10/01/2013 03:24 PM, Michał Sawicz wrote:
> On 01.10.2013 01:23, Michał Sawicz wrote:

[trim /]

>> What I'd like to do first is to make sure the array rebuilds onto the 6
>> healthy drives, regardless of the bad blocks, I can probably recover the
>> data (assuming I can find out which files were affected - any
>> pointers?), but if the array doesn't rebuild correctly, I'm afraid it's
>> gonna get worse, and soon.
> 
> OK, so a ddrescue and --zero-superblock later my array is rebuilding
> onto one healthy spare. According to ddrescue I only lost some 8kB of
> data in more or less one chunk, so after the array is rebuilt my next
> task will be finding which file(s) that was.

I noticed that you never got any direct response, and I realized you
might still be at risk.  In particular, your OP said:

> As a side note... I've a full array scrub enabled on the array every now
> and again - and it did run after the disk started failing blocks, but
> they never got reallocated, they all remain pending / uncorrectable. Is
> that expected?

The answer is *NO*.  That is not expected.  But it does happen with
timeout mismatches, and the double failure you experienced is a common
result of an error recovery timeout mismatch.  A timeout mismatch is
when a drive keeps retrying a bad sector internally long after the OS
has given up on the read.  The drive then never returns a read error in
time for md to rewrite the sector from parity, so the sector stays
pending.  It is almost always associated with consumer-grade hard
drives in RAID arrays.

You might want to search the list archives for various combinations of
"error recovery", "scterc", "URE" and "timeout mismatch" for a full
description of the problem and the recommended ways to avoid it.
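
For illustration only, here is a rough sketch of the comparison behind a
timeout mismatch check.  It assumes the drive's error recovery limit has
already been read with "smartctl -l scterc /dev/sdX" (reported in tenths
of a second) and that the kernel's per-device command timeout lives in
/sys/block/sdX/device/timeout (in seconds, commonly 30 by default).  The
function names and the 120-second worst-case retry figure for drives
without a recovery limit are assumptions made for this example, not
measured values.

```python
from pathlib import Path

# Assumption for this sketch: a drive with no error recovery limit may
# keep retrying a bad sector for up to about two minutes.
ASSUMED_UNBOUNDED_RETRY_S = 120


def kernel_timeout_seconds(disk, sysfs_root="/sys/block"):
    """Read the kernel's command timeout (seconds) for a disk such as 'sda'."""
    path = Path(sysfs_root, disk, "device", "timeout")
    return int(path.read_text().strip())


def has_timeout_mismatch(erc_deciseconds, kernel_timeout_s,
                         unbounded_retry_s=ASSUMED_UNBOUNDED_RETRY_S):
    """Return True if the drive may still be retrying when the kernel gives up.

    erc_deciseconds is the drive's error recovery limit in tenths of a
    second, or None/0 when the limit is disabled or unsupported.
    """
    if not erc_deciseconds:
        # No limit on internal retries: fall back to the assumed worst case.
        drive_retry_s = unbounded_retry_s
    else:
        drive_retry_s = erc_deciseconds / 10.0
    return drive_retry_s >= kernel_timeout_s


if __name__ == "__main__":
    # Hypothetical values: a 7.0 second recovery limit, and no limit at all.
    print(has_timeout_mismatch(70, 30))
    print(has_timeout_mismatch(None, 30))
```

A drive that reports a 7.0 second limit (70) is comfortably inside a 30
second kernel timeout.  A drive with no limit is not, unless the kernel
timeout is raised well above the drive's worst-case retry time.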

HTH,

Phil

