From: Chris Murphy <lists@colorremedies.com>
To: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Uncorrectable errors on RAID-1?
Date: Tue, 23 Dec 2014 15:09:07 -0700
Message-ID: <CAJCQCtTK2OinQZakArtX4BD=JAz6UJr=dhc2xUbcqJJL1v6Kkw@mail.gmail.com>
In-Reply-To: <20141223211605.GC436@hungrycats.org>

On Tue, Dec 23, 2014 at 2:16 PM, Zygo Blaxell <zblaxell@furryterror.org> wrote:
> On Sun, Dec 21, 2014 at 05:25:47PM -0700, Chris Murphy wrote:
>> For the kernel to automatically fix
>> bad sectors by overwriting them, the drive needs to explicitly report
>> read errors. If the SCSI command timer value is shorter than the
>> drive's error recovery, the SATA link might get reset before the drive
>> reports the read error and then uncorrected errors will persist
>> instead of being automatically fixed.
>
> Is there a way to tell the kernel to go ahead and assume that all timeouts
> are effectively read errors?

The timer in /sys is a kernel command timer, not a device timer, even
though it's exposed per block device. You need to raise it from 30
seconds to something higher to get the behavior you want. It doesn't
really make sense to say: time out in 30 seconds, but instead of
reporting a timeout, report a read error. They're completely
different things.
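
To make the "change that from 30 to something higher" part concrete,
here is a minimal sketch, assuming the command timer is exposed at
/sys/block/<dev>/device/timeout (the usual place for SCSI/libata
disks). The function names and the sysfs_root parameter are
hypothetical and exist only so the logic can be tried against a
scratch directory; writing the real file needs root.

```python
from pathlib import Path


def timeout_path(device: str, sysfs_root: str = "/sys") -> Path:
    """Path of the kernel SCSI command timer for a block device, e.g. 'sda'.

    Assumption: the timer is exposed as <sysfs_root>/block/<dev>/device/timeout.
    """
    return Path(sysfs_root) / "block" / device / "device" / "timeout"


def read_command_timer(device: str, sysfs_root: str = "/sys") -> int:
    """Return the current command timer value in seconds."""
    return int(timeout_path(device, sysfs_root).read_text().strip())


def set_command_timer(device: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Write a new command timer value in seconds (needs root on a real system)."""
    if seconds <= 0:
        raise ValueError("command timer must be a positive number of seconds")
    timeout_path(device, sysfs_root).write_text(f"{seconds}\n")
```

The value written this way does not survive a reboot, so in practice
it would have to be reapplied at boot, for example from a udev rule or
an init script.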

libata lists all sorts of errors, so dumping all of them into a read
error doesn't make sense. A lot of those errors don't report back a
sector, and the key part of a read error is which sector(s) have the
problem so that they can be fixed. Without that information, the
ability to fix it is lost. And it's the drive that needs to report
this.


> For a simple non-removable hard disk (i.e.
> not removable and not optical), that seems like a reasonable workaround
> for an assortment of firmware brokenness.

Oven doesn't work, so let's spray gasoline on it and light it and the
kitchen on fire so that we can cook this damn pizza! That's what I
just read. Sorry. It doesn't seem like a good idea to me to map all
errors as read errors.


> I just did a quick survey of random drives here and found less than 10%
> support "smartctl -l scterc".  A lot of server drives (or at least the
> drives that shipped in servers) don't have it, but laptop drives do.
> Drives with firmware that has horrifying known bugs do also have this
> feature.  :-P

Any decent server SATA drive should support SCT ERC. The inexpensive
WDC Red NAS drives all have it, and their default was a reasonable
70 deciseconds (7 seconds) last time I checked.
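
The unit mismatch is easy to trip over: SCT ERC is expressed in
deciseconds, while the kernel command timer is in seconds. A small
sketch of the comparison that matters here, namely whether the drive
gives up and reports the read error before the kernel timer expires.
The function names are hypothetical, and the values are assumed to
have been read by hand from "smartctl -l scterc" and from /sys.

```python
def deciseconds_to_seconds(deciseconds: int) -> float:
    """Convert an SCT ERC value (tenths of a second) to seconds."""
    return deciseconds / 10


def drive_reports_before_timer(erc_deciseconds: int, command_timer_seconds: int) -> bool:
    """True if the drive's error recovery limit is shorter than the kernel timer.

    When this is False, the link may be reset before the drive reports
    the read error, so the bad sector is never identified or rewritten.
    """
    return deciseconds_to_seconds(erc_deciseconds) < command_timer_seconds
```

With 70 deciseconds and the 30 second timer this returns True. A drive
whose recovery can run longer than the timer (or that has no SCT ERC
at all, so the limit is unknown) is the case where the timer needs to
be raised instead.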

It might be that you're using SAS drives? In that case they may have
something different than SCT ERC that serves the same purpose, but I
don't have any SAS drives here to check this. I'd expect any SAS drive
already has short error recoveries by default, but that expectation
might be flawed.

Chris Murphy
