Re: Uncorrectable errors on RAID-1?

From: Phillip Susi <psusi@ubuntu.com>
To: Chris Murphy <lists@colorremedies.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Uncorrectable errors on RAID-1?
Date: Sun, 04 Jan 2015 23:18:59 -0500
Message-ID: <54AA10B3.3010900@ubuntu.com>
In-Reply-To: <CAJCQCtR5qZmD2RgfQMSUe9BLtUtimTyf-itBM1Y+9hiRftfwgg@mail.gmail.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 01/03/2015 12:31 AM, Chris Murphy wrote:
> It's not a made to order hard drive industry. Maybe one day you'll
> be able to 3D print your own with its own specs.

And Wookiees did not live on Endor.  What's your point?

> Sticking fingers in your ears doesn't change the fact there's a 
> measurable difference in support requirements.

Sure, just don't misrepresent one requirement for another.  Just
because I don't care about a warranty from the hardware manufacturer
does not mean I have no right to expect the kernel to perform
*reasonably* on that hardware.

> This is architecture astronaut territory.
> 
> The system only has a terrible response for two reasons: 1. The
> user spec'd the wrong hardware for the use case; 2. The distro
> isn't automatically leveraging existing ways to mitigate that user
> mistake by changing either SCT ERC on the drives, or the SCSI
> command timer for each block device.

No, the response is terrible because the kernel either waits an
unreasonably long time or fails the drive and kicks it out of the
array instead of trying to repair it.  Blaming the user for not buying
better hardware is not an appropriate answer when the kernel fails so
badly to handle commonly available hardware that doesn't behave
ideally.

> Now, even though that solution *might* mean long recoveries on 
> occasion, it's still better than link reset behavior which is what
> we have today because it causes the underlying problem to be fixed
> by md/dm/Btrfs once the read error is reported. But no distro has 
> implemented this $500 man hour solution. Instead you're suggesting
> a $500,000 fix that will take hundreds of man hours and end user
> testing to find all the edge cases. It's like, seriously, WTF?

Seriously?  Treating a timeout the same way you treat an unrecoverable
media error is no herculean task.
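
As a rough illustration only (a toy model in Python, not kernel code;
ToyDisk, MediaError, ReadTimeout and mirrored_read are all made-up
names), here is what "treat a timeout like a media error" could look
like for a mirrored read: either failure falls through to the next
mirror, and the good data is written back over the copy that failed.
The sketch assumes that rewriting a block clears the fault.

```python
class MediaError(Exception):
    """The device reported an unrecoverable read error."""


class ReadTimeout(Exception):
    """The device did not answer within the command timer."""


class ToyDisk:
    """Hypothetical in-memory disk; faults maps block number -> exception class."""

    def __init__(self, blocks, faults=None):
        self.blocks = dict(blocks)
        self.faults = dict(faults or {})

    def read(self, n):
        fault = self.faults.get(n)
        if fault is not None:
            raise fault(f"block {n}")
        return self.blocks[n]

    def write(self, n, data):
        self.blocks[n] = data
        # Assumption for this sketch: rewriting a block clears its fault.
        self.faults.pop(n, None)


def mirrored_read(disks, n):
    """Read block n, handling a timeout exactly like a media error."""
    failed = []
    for disk in disks:
        try:
            data = disk.read(n)
        except (MediaError, ReadTimeout):
            failed.append(disk)  # same path for both kinds of failure
            continue
        for bad in failed:
            bad.write(n, data)  # repair the copies that failed earlier
        return data
    raise MediaError(f"block {n} unreadable on every mirror")
```

With one disk timing out on block 0 and a healthy mirror,
mirrored_read returns the mirror's data and the first disk reads
cleanly afterwards, with no drive dropped from the array.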

> Ok well I think that's hubris unless you're a hard drive engineer. 
> You're referring to how drives behaved over a decade ago, when bad 
> sectors were persistent rather than remapped, and we had to scan
> the drive at format time to build a map so the bad ones wouldn't be
> used by the filesystem.

Remapping has nothing to do with it: we are talking about *read*
errors, which do not trigger a remap.

> http://www.seagate.com/files/www-content/support-content/documentation/product-manuals/en-us/Enterprise/Savvio/Savvio%2015K.3/100629381e.pdf
>
>  That's a high end SAS drive. Its default is to retry up to 20
> times, which takes ~1.4 seconds, per sector. But also note how it
> says

20 retries on a 15,000 rpm drive take only 80 milliseconds, not 1.4
seconds.  15,000 rpm / 60 seconds per minute = 250 rotations (and so
retries) per second, or 4 ms per retry.
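
The arithmetic can be checked with a few lines of Python.  This is
only a sketch under the assumption made above, that each retry costs
exactly one platter rotation (seek and ECC processing time are
ignored); retry_time_ms is a made-up helper name.

```python
def retry_time_ms(rpm, retries):
    """Time spent on retries, assuming one retry per platter rotation."""
    rotations_per_second = rpm / 60
    ms_per_rotation = 1000 / rotations_per_second
    return retries * ms_per_rotation


print(retry_time_ms(15_000, 20))  # 80.0
```

By the same assumption, reaching 1.4 seconds on a 15,000 rpm drive
would take 350 rotations rather than 20.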

> Maybe you'd prefer seeing these big, cheap, "green" drives have 
> shorter ERC times, with a commensurate reality check with their 
> unrecoverable error rate, which right now is already two orders of
> magnitude higher than enterprise SAS drives. So what if this means
> that rate is 3 or 4 orders of magnitude higher?

20 retries vs. 200 retries does not reduce the URE rate by orders of
magnitude; more like 1% *maybe*.  200 vs 2000 makes no measurable
difference at all.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBCgAGBQJUqhCxAAoJENRVrw2cjl5RhDYH/RLbHXEPyjK4j6u33ElOyS5S
W5/nfiT1ZZjVAFxJwD0y/gt2L61hB1PQdlUjBm2NayExfCXn3sEuccAxvjMDrvsL
dFJOV8G/7GBbUfsD0uBustG5639QGc30bRzuiw/URT77zNf+T6+5SmTPSC3Oaj3j
fCcDdiKCwNcYiUF3/Q3gdh4XVI8wgoABHC2S/GqvRB+FmmqD6Yt6yG50TG5sPBzq
zSUSxWjOPwVinZOlPfCUCFr3buw+yzg5fclcvaNRStJM38gtK0UGgeIHFgCViHtN
0xNRCKWMu3XkfjfOI/cYVor79K4sQlz9K83Ja/UAMrOtopdlKjn9N04oIiPdsbg=
=u/i9
-----END PGP SIGNATURE-----