Re: Reconstruct a RAID 6 that has failed in a non typical manner

From: Phil Turmel <philip@turmel.org>
To: NeilBrown <nfbrown@novell.com>,
	Clement Parisot <clement.parisot@inria.fr>
Cc: linux-raid@vger.kernel.org
Subject: Re: Reconstruct a RAID 6 that has failed in a non typical manner
Date: Mon, 21 Dec 2015 07:20:43 -0500
Message-ID: <5677EE9B.7000607@turmel.org>
In-Reply-To: <87vb7s4gfv.fsf@notabene.neil.brown.name>

Good morning Neil,

On 12/20/2015 10:40 PM, NeilBrown wrote:
> On Fri, Nov 06 2015, Phil Turmel wrote:
>>
>> for x in /sys/block/*/device/timeout ; do echo 180 > "$x" ; done
>>
> 
> Would it make sense for mdadm to automagically do something like this?
> i.e. whenever it adds a device to an array (with redundancy) it writes
> 180 (or something configurable) to the 'timeout' file if there is one?

Yes, I've been thinking this should be automagic, but I'm not sure if it
really belongs at the MD layer.

> Why do we pick 180?

I empirically determined that 120 was sufficient on the Seagate drives
that kicked my tail when I first figured this out.  Someone else (I'm
afraid I don't remember who) found that to be not quite enough and
suggested 180.

> Can this cause problems on some drives?

Not that I'm aware of, but it does make for rather troublesome
*application* stalls.

Considering that this aggressively long error recovery behavior is
*intended* for desktop drives or any non-redundant usage, I believe
Linux shouldn't time out at 30 seconds by default.  That cuts off any
opportunity for these drives to report a good sector that takes more
than 30 seconds to reconstruct.

Meanwhile, any device that *does* support scterc and/or has scterc
enabled out of the gate arguably should have a timeout just a few
seconds longer than the larger of its two error recovery settings
(read and write).
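
To make the "few seconds longer than the larger of the two" rule
concrete, here is a minimal sketch.  The function name erc_timeout and
the default 3 second margin are my own placeholders, not anything that
exists in the kernel or mdadm.  It assumes the two ERC limits are given
in tenths of a second, which is the unit scterc settings use.

```shell
# Hypothetical helper: print a command timeout in seconds that is a few
# seconds longer than the larger of the read and write ERC limits.
# $1 = read limit, $2 = write limit (both in tenths of a second)
# $3 = margin in seconds (assumed default of 3, not from this thread)
erc_timeout() {
    read_ds=$1
    write_ds=$2
    margin=${3:-3}
    # Take the larger of the two limits.
    if [ "$read_ds" -gt "$write_ds" ]; then
        max_ds=$read_ds
    else
        max_ds=$write_ds
    fi
    # Round up to whole seconds, then add the margin.
    echo $(( (max_ds + 9) / 10 + margin ))
}
```

With the common 7.0 second setting on both timers, "erc_timeout 70 70"
prints 10, which could then be written to the drive's
/sys/block/*/device/timeout file.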

I propose:

1) The kernel default timeout be set to 180 seconds (or some number
cooperatively established with the drive manufacturers).

2) The initial probe sequence that retrieves the drive's parameter pages
should also pick up the SCT page and, if ERC is enabled, adjust the
timeout downward.  I believe these capabilities should be reflected in
sysfs for use by udev.

3) mdadm should inspect member device ERC capabilities during creation
and assembly and enable ERC on drives that have it available but
disabled.
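
Until something like the above exists, the same decision can be roughed
out in userspace.  The sketch below classifies the text printed by
"smartctl -l scterc" as enabled, disabled, or unsupported.  The function
name erc_state is hypothetical, and the parsing assumes the report has
"Read:" and "Write:" lines followed by either a number or the word
"Disabled", which is how I recall smartctl printing it; check against
your smartmontools version before relying on it.

```shell
# Hypothetical helper: read `smartctl -l scterc` output on stdin and
# print one of: enabled, disabled, unsupported.
# Assumes "Read:"/"Write:" lines carry a number (ERC on) or "Disabled".
erc_state() {
    awk '
        $1 == "Read:" || $1 == "Write:" {
            if ($2 ~ /^[0-9]+$/) enabled = 1
            else if ($2 == "Disabled") disabled = 1
        }
        END {
            if (enabled) print "enabled"
            else if (disabled) print "disabled"
            else print "unsupported"
        }'
}

# Example use (needs root and smartmontools; sdX is a placeholder):
#   case "$(smartctl -l scterc /dev/sdX | erc_state)" in
#       disabled)    smartctl -l scterc,70,70 /dev/sdX ;;
#       unsupported) echo 180 > /sys/block/sdX/device/timeout ;;
#   esac
```

The split into three states mirrors the proposal: drives with ERC
available but off get it switched on, and drives without it keep the
long timeout.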

In light of your maintainership notice, I will pursue this directly.

Phil
