From: Andreas Klauer <Andreas.Klauer@metamorpher.de>
To: Alexander Shenkin <al@shenkin.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: recovering failed raid5
Date: Fri, 28 Oct 2016 15:33:04 +0200
Message-ID: <20161028133304.GA11564@metamorpher.de>
In-Reply-To: <715b259f-1e56-9606-edc4-3e5c4d57744b@shenkin.org>

On Fri, Oct 28, 2016 at 01:22:31PM +0100, Alexander Shenkin wrote:
> One remaining question: is sdc definitely toast?

In my opinion, a drive is toast starting from the very first reallocated/
pending/uncorrectable sector. Your drive has several of those, and that's
only the ones the drive already knows about; there may be more.

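To make that rule concrete, here is a minimal Python sketch that scans
`smartctl -A` style text and reports any of the three counters with a
non-zero raw value. The function name, the WATCHED set, and the SAMPLE
table are illustrative assumptions (the sample is made up, not your
drive's output), and attribute names vary between vendors, so treat it
as a starting point rather than a complete check.

```python
import re

# Attributes treated as disqualifying once their raw value is non-zero.
# Assumption: these are the names smartctl prints for the drive in question;
# other vendors may use different names.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def bad_sector_counts(smartctl_output):
    """Return {attribute_name: raw_value} for watched attributes above zero."""
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns and start with a numeric ID:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name = fields[1]
        if name not in WATCHED:
            continue
        # Raw values can carry trailing annotations; keep the leading integer.
        match = re.match(r"\d+", fields[9])
        if match and int(match.group()) > 0:
            found[name] = int(match.group())
    return found


# Hypothetical sample input, for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
  9 Power_On_Hours          0x0032   070   070   000    Old_age   Always       -       22145
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       24
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    print(bad_sector_counts(SAMPLE))
```

In practice you would feed it the output of `smartctl -A /dev/sdX` for
each member disk; any non-empty result means the disk already fails the
rule above.
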
> Or, is it possible that the Timeout Mismatch (as mentioned by Robin Hill;
> thanks Robin) is flagging the drive as failed, when something else is at
> play and perhaps the drive is actually fine?

I don't believe in timeout mismatches, either. The timeouts are generous.
Waiting for a disk to wake from standby is not a problem, and that takes
ages already. If a disk gets stuck even longer in error correction limbo
and it gets kicked because of it - IMHO that's the right call.

A disk that is unable to read its data, a disk that refuses to write data,
a disk that needs help from the RAID layer to correct its errors,
should be kicked because it's not able to pull its own weight.

You need drives that work without errors, without outside help, because
during a rebuild, when the RAID is already degraded, there won't be any
outside help. Either the disks work or your RAID is dead.

RAID redundancy is supposed to allow disks to be replaced (mdadm --replace).
If you use it instead to keep fixing errors on other disks, there is no
real redundancy left. In a RAID, if one of your disks has errors,
you get rid of it as soon as possible.

Whether timeouts played a part in your RAID failing is not important.
It failed because you had two broken disks and didn't notice them in time.
Testing, monitoring, and actually acting on the first error are important.

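As one small piece of such monitoring, the sketch below parses text in
the format of /proc/mdstat and lists arrays that are running with a
missing member. The function name and the SAMPLE_MDSTAT text are
illustrative assumptions (the sample is made up), and this only catches
a disk after it has been kicked, so it complements SMART checks and
mdadm's own monitor mode rather than replacing them.

```python
import re

# Matches the status tail of an array entry, e.g. "[4/3] [UU_U]".
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# Matches the first line of an array entry, e.g. "md0 : active raid5 ...".
NAME_RE = re.compile(r"^(md\S*)\s*:")


def degraded_arrays(mdstat_text):
    """Return the names of arrays whose member map contains a '_' slot."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        name = NAME_RE.match(line)
        if name:
            current = name.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            # 'U' marks a working member, '_' a missing or failed one.
            if "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# Hypothetical sample input, for illustration only.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdb1[1] sda1[0]
      5860268032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UU_U]

md1 : active raid1 sdf1[1] sde1[0]
      976630464 blocks super 1.2 [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

Run from cron against the real /proc/mdstat and mailed on any non-empty
result, something like this at least ensures a kicked disk does not go
unnoticed for weeks.
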
People have different opinions on this. Someone might argue.
It's up to you what risks to take.
Regards
Andreas Klauer