From: Phil Turmel <philip@turmel.org>
To: David Brown <david.brown@hesbynett.no>
Cc: Roman Mamedov <rm@romanrm.ru>,
Miles Fidelman <mfidelman@meetinghouse.net>,
linux-raid@vger.kernel.org
Subject: Re: 3TB drives failure rate
Date: Mon, 29 Oct 2012 09:02:32 -0400
Message-ID: <508E7E68.1030202@turmel.org>
In-Reply-To: <508E3626.8080404@hesbynett.no>

On 10/29/2012 03:54 AM, David Brown wrote:
> On 29/10/2012 05:29, Roman Mamedov wrote:
>> On Sun, 28 Oct 2012 20:09:06 -0400
>> Miles Fidelman <mfidelman@meetinghouse.net> wrote:
>>
>>> Two separate issues.
>>
>>> The comments about "dropping out of raid" had to do with drives that are
>>> slow to come out of sleep mode - causing hiccups when the RAID
>>> hardware/software simply doesn't see the drive, and drops it.
>>
>> There are no drives in good working order that would come out of sleep
>> mode SO slowly, that the Linux kernel ATA subsystem would even give up
>> trying and return an I/O error from it (and it's only after that point,
>> when this begins to become mdraid's concern).
>>
>> I have yet to see even any first sign of "SATA frozen" due to drive sleep
>> mode, let alone to imagine this last through all the port resets and
>> speed step-downs the SATA driver will attempt.
>>
>> So the "sleep" issue is not relevant with Linux software RAID, and if
>> you're still concerned that it might be, you can just reconfigure your
>> drives so they don't enter that sleep mode.
>>
>
> The same applies to the long retry times of "desktop" drives - Linux
> software raid has no problem with them. Some (perhaps "many" or "all" -
> I don't have the experience with hardware raid cards to say) hardware
> raid cards see long read retries as a timeout on the disk, and will drop
> the whole disk from the array.

Not true.  The default Linux controller timeout is 30 seconds.  Drives
that spend longer than the timeout in recovery will be reset.  If they
don't respond to the reset (because they're busy in recovery) when the
raid tries to write the correct data back to them, they will be kicked
out of the array.

Been there, done that, have the tee shirt.

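For anyone who wants to see what timeout their disks are actually running
with, here is a minimal sketch.  It assumes the per-device command timeout is
exposed at /sys/block/<dev>/device/timeout (in seconds), which is where SCSI
and libata disks normally publish it.  The function name and the sys_block
parameter are made up for illustration.

```python
from pathlib import Path


def read_command_timeouts(sys_block="/sys/block"):
    """Return {device_name: timeout_seconds} for every block device
    that exposes a device/timeout attribute under sys_block."""
    timeouts = {}
    for dev in sorted(Path(sys_block).glob("*")):
        timeout_file = dev / "device" / "timeout"
        try:
            timeouts[dev.name] = int(timeout_file.read_text().strip())
        except (OSError, ValueError):
            # No such attribute (loop devices, md arrays, ...) or
            # unreadable/unparsable content: skip this device.
            continue
    return timeouts
```

Calling read_command_timeouts() on an untuned box would be expected to show
30 for each SATA disk, per the default mentioned above.
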
> Linux md raid will wait for the data to come in, and use it if it is
> valid. If the disk returns an error, the md layer will re-create the
> data from the other disks, then re-write the bad block. The disk will
> then re-locate the bad block to one of its spare blocks, and everything
> should be fine. (If the write also fails, the drive gets kicked out.)

Precisely, except for the wait.  It won't wait that long unless you
change the default driver timeout.  The Seagate drives that did this to
me were kicked out of the array because they were still stuck on recovery
when the write was commanded.

The drives were (and still are) just fine.  They had UREs that needed to
be rewritten.  When I later wiped the drives, they remained
relocation-free.  They are now in solo duty as off-site backups.

> So with software raid, there are no problems using desktop drives of any
> sort in your array (assuming, of course, you don't have physical issues
> such as heat generation, vibration, support contracts, etc., that might
> otherwise make you prefer "raid" or "enterprise" drives).

Simply not true, in my experience.  You *must* set ERC shorter than the
timeout, or set the driver timeout longer than the drive's worst-case
recovery time.  The defaults for desktop drives are *not* suitable for
Linux software raid.

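The rule above can be written down as a small check.  This is only a sketch
under stated assumptions: ERC is given in tenths of a second (the unit
smartctl uses for scterc), a value of 0 or None means ERC is disabled or
unsupported, and the 120-second worst-case recovery time is a placeholder,
not a figure from this thread.  The function names and the 70/180 values in
the suggested commands are likewise illustrative.

```python
def timeout_is_safe(driver_timeout_s, erc_deciseconds=None,
                    worst_case_recovery_s=120.0):
    """True if the drive gives up on a bad sector before the kernel
    gives up on the drive."""
    if erc_deciseconds:
        # ERC enabled: the drive reports the error after this long.
        return erc_deciseconds / 10.0 < driver_timeout_s
    # ERC disabled or unsupported: the drive may retry for its full
    # worst-case recovery time (assumed value, adjust for your drive).
    return worst_case_recovery_s < driver_timeout_s


def suggest_fix(device, supports_erc):
    """Return a shell command implementing one of the two remedies.
    The numbers (7.0 s ERC, 180 s timeout) are example values only."""
    if supports_erc:
        # Shorten the drive's error recovery below the driver timeout.
        return f"smartctl -l scterc,70,70 /dev/{device}"
    # Otherwise lengthen the driver timeout past the drive's recovery.
    return f"echo 180 > /sys/block/{device}/device/timeout"
```

With the 30 second default, timeout_is_safe(30, erc_deciseconds=70) holds,
while timeout_is_safe(30) with ERC off does not.  Neither setting is assumed
to survive a reboot or power cycle, so whatever you choose would need to be
reapplied at boot.
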
Phil