All of lore.kernel.org
 help / color / mirror / Atom feed
From: Phillip Susi <psusi@ubuntu.com>
To: Chris Murphy <lists@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: scrub implies failing drive - smartctl blissfully unaware
Date: Tue, 18 Nov 2014 15:58:18 -0500	[thread overview]
Message-ID: <546BB2EA.5080809@ubuntu.com> (raw)
In-Reply-To: <47FB8035-FEA6-40E1-9672-5BBF92B283A9@colorremedies.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/18/2014 1:57 PM, Chris Murphy wrote:
> So a.) use smartctl -l scterc to change the value below 30 seconds 
> (300 deciseconds) with 70 deciseconds being reasonable. If the
> drive doesn’t support SCT commands, then b.) change the linux scsi
> command timer to be greater than 120 seconds.
> 
> Strictly speaking the command timer would be set to a value that 
> ensures there are no link reset messages in dmesg, that it’s long 
> enough that the drive itself times out and actually reports a read 
> error. This could be much shorter than 120 seconds. I don’t know
> if there are any consumer drives that try longer than 2 minutes to 
> recover data from a marginally bad sector.

Are there really any that take longer than 30 seconds?  That's enough
time for thousands of retries.  If it can't be read after a dozen
tries, it ain't never gonna work.  It seems absurd that a drive would
keep trying for so long.

> Ideally though, don’t use drives that lack SCT support in multiple 
> device volume configurations. An up to 2 minute hang of the
> storage stack isn’t production compatible for most workflows.

Wasn't there an early failure flag that md ( and therefore, btrfs when
doing raid ) sets so the scsi stack doesn't bother with recovery
attempts and just fails the request?  Thus if the drive takes longer
than the scsi_timeout, the failure would be reported to btrfs, which
then can recover using the other copy, write it back to the bad drive,
and hopefully that fixes it?

In that case, you probably want to lower the timeout so that the
recover kicks in sooner instead of hanging your IO stack for 30 seconds.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)

iQEcBAEBAgAGBQJUa7LqAAoJEI5FoCIzSKrw2Y0H/3Q03vCTxXeGkqvOYG/arZgk
yHq/ruWIKMgfaESdu0Ujzoqbe7XopUueU8luKon52LtbgIFhOM5XnMu/o52KPXIS
CVLnNtRWNbykHJMQu0Sk4lpPrUVI5QP9Ya9ZGVFM4x2ehvJGDAT+wcRWP5OH0waf
mgK+oOnadsckqiSbcQhGrxecjTWZFu5WUCzWFPx+4sEV5ta/tmL0obhHcyho+SDN
lCib2KI9YGzS2sm+V/Qe2i/3ZMp8QY8aAD2x/KlV0DBxkRLZQdOoD3ZkBiaApxZX
VMfXNCKLMexwpe+rGGemH/fCvhRpM/z1aHu8D1u4QVnoWPzD51vX7ySLkwRHaGo=
=XZkM
-----END PGP SIGNATURE-----

  reply	other threads:[~2014-11-18 20:58 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <E1XqYMg-0000YI-8y@watricky.valid.co.za>
2014-11-18  7:29 ` scrub implies failing drive - smartctl blissfully unaware Brendan Hide
2014-11-18  7:36   ` Roman Mamedov
2014-11-18 13:24     ` Brendan Hide
2014-11-18 15:16       ` Duncan
2014-11-18 12:08   ` Austin S Hemmelgarn
2014-11-18 13:25     ` Brendan Hide
2014-11-18 16:02     ` Phillip Susi
2014-11-18 15:35   ` Marc MERLIN
2014-11-18 16:04     ` Phillip Susi
2014-11-18 16:11       ` Marc MERLIN
2014-11-18 16:26         ` Phillip Susi
2014-11-18 18:57     ` Chris Murphy
2014-11-18 20:58       ` Phillip Susi [this message]
2014-11-19  2:40         ` Chris Murphy
2014-11-19 15:11           ` Phillip Susi
2014-11-20  0:05             ` Chris Murphy
2014-11-25 21:34               ` Phillip Susi
2014-11-25 23:13                 ` Chris Murphy
2014-11-26  1:53                   ` Rich Freeman
2014-12-01 19:10                   ` Phillip Susi
2014-11-28 15:02                 ` Patrik Lundquist
2014-11-19  2:46         ` Duncan
2014-11-19 16:07           ` Phillip Susi
2014-11-19 21:05             ` Robert White
2014-11-19 21:47               ` Phillip Susi
2014-11-19 22:25                 ` Robert White
2014-11-20 20:26                   ` Phillip Susi
2014-11-20 22:45                     ` Robert White
2014-11-21 15:11                       ` Phillip Susi
2014-11-21 21:12                         ` Robert White
2014-11-21 21:41                           ` Robert White
2014-11-22 22:06                           ` Phillip Susi
2014-11-19 22:33                 ` Robert White
2014-11-20 20:34                   ` Phillip Susi
2014-11-20 23:08                     ` Robert White
2014-11-21 15:27                       ` Phillip Susi
2014-11-20  0:25               ` Duncan
2014-11-20  2:08                 ` Robert White
2014-11-19 23:59             ` Duncan
2014-11-25 22:14               ` Phillip Susi
2014-11-28 15:55                 ` Patrik Lundquist
2014-11-21  4:58   ` Zygo Blaxell
2014-11-21  7:05     ` Brendan Hide
2014-11-21 12:55       ` Ian Armstrong
2014-11-21 17:45         ` Chris Murphy
2014-11-22  7:18           ` Ian Armstrong
2014-11-21 17:42       ` Zygo Blaxell
2014-11-21 18:06         ` Chris Murphy
2014-11-22  2:25           ` Zygo Blaxell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=546BB2EA.5080809@ubuntu.com \
    --to=psusi@ubuntu.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=lists@colorremedies.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.