From: David Brown <david.brown@hesbynett.no>
To: Reindl Harald <h.reindl@thelounge.net>,
Adam Goryachev <mailinglists@websitemanagers.com.au>,
Jeff Allison <jeff.allison@allygray.2y.net>
Cc: linux-raid@vger.kernel.org
Subject: Re: proactive disk replacement
Date: Tue, 21 Mar 2017 15:15:52 +0100
Message-ID: <58D13598.50403@hesbynett.no>
In-Reply-To: <09f4c794-8b17-05f5-10b7-6a3fa515bfa9@thelounge.net>

On 21/03/17 14:24, Reindl Harald wrote:
>
>
> On 21.03.2017 at 14:13, David Brown wrote:
>> On 21/03/17 12:03, Reindl Harald wrote:
>>>
>>> On 21.03.2017 at 11:54, Adam Goryachev wrote:
>> <snip>
>>>
>>>> In addition, you claim that a drive larger than 2TB is almost
>>>> certainly going to suffer from a URE during recovery, yet this is
>>>> exactly the situation you will be in when trying to recover a RAID10
>>>> with member devices 2TB or larger. A single URE on the surviving
>>>> portion of the RAID1 will cause you to lose the entire RAID10 array.
>>>> On the other hand, 3 UREs on the three remaining members of the RAID6
>>>> will not cause more than a hiccup (as long as there is no more than
>>>> one URE on the same stripe, which I would argue is ... exceptionally
>>>> unlikely).
>>>
>>> given that your disks have the same age, errors on another disk
>>> become more likely once one has failed, and the recovery of a RAID6
>>> takes *many hours* of heavy IO on *all disks*, compared with a much
>>> faster restore of a RAID1/10 - guess in which case a URE is more
>>> likely
>>>
>>> additionally, why should the whole array fail just because a single
>>> block gets lost? there is no parity which needs to be calculated, you
>>> just lose a single block somewhere - RAID1/10 are much simpler in
>>> their implementation
>>
>> If you have RAID1, and you have an URE, then the data can be recovered
>> from the other half of that RAID1 pair. If you have had a disk failure
>> (a manual removal for replacement, or a real failure), and you get an
>> URE on the other half of that pair, then you lose data.
>>
>> With RAID6, you need an additional failure (either another full disk
>> failure or an URE in the /same/ stripe) to lose data. RAID6 has higher
>> redundancy than two-way RAID1 - of this there is /no/ doubt
>
> yes, but with RAID5/RAID6 *all disks* are involved in the rebuild,
> while with a 10-disk RAID10 only one disk needs to be read and the data
> written to the new one - all the other disks are not involved in the
> resync at all
True...
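For a sense of scale, here is a quick sketch in Python (the 4 TB member
size and ~150 MB/s sustained throughput are assumed for illustration,
not figures from this thread):

  # Rough I/O involved in a rebuild, under assumed figures:
  # 4 TB members, ~150 MB/s sustained sequential throughput per drive.
  drive_bytes = 4 * 10**12
  throughput = 150 * 10**6              # bytes per second, assumed
  hours_one_drive = drive_bytes / throughput / 3600

  # RAID1/10: one surviving mirror is read in full.
  raid10_total_read = drive_bytes
  # 10-drive RAID6: all nine survivors are read in full, in parallel.
  raid6_total_read = 9 * drive_bytes

  print(round(hours_one_drive, 1))      # ~7.4 h to read one member end to end
  print(raid10_total_read)              # 4e12 bytes read for the RAID10 rebuild
  print(raid6_total_read)               # 3.6e13 bytes read for the RAID6 rebuild

If nothing else limits it, the wall-clock time is roughly one
full-member read in either case, since the RAID6 reads run in parallel -
but the RAID6 rebuild touches roughly nine times as much media, which is
where the extra load on the other disks comes from.
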
>
> for most arrays the disks have a similar age and usage pattern, so when
> the first one fails it becomes likely that it won't take long for
> another one to fail, and so load and recovery time matter
False. There is no reason to suspect that - certainly not to within the
hours or days it takes to rebuild your array. Disk failure patterns show
a peak within the first month or so (failures due to manufacturing or
handling), then a very low failure rate for a few years, then a
gradually increasing rate after that. There is no very significant
correlation between drive failures within the same system, nor between
usage and failures. It might seem reasonable to suspect that a drive is
more likely to fail during a rebuild because it is being heavily used,
but that does not appear to be the case in practice. You will /spot/
more errors at that point - simply because you don't see errors in parts
of the disk that are not read - but the rebuild does not cause them.
And even if it /were/ true, the key point is whether an error causes
data loss. An error during reading for a RAID1 rebuild means lost data.
An error during reading for a RAID6 rebuild means you have to read an
extra sector from another disk and correct the mistake.
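To put rough numbers on that, a back-of-the-envelope sketch in Python
(the 1 error per 1e14 bits rate is the usual consumer-drive datasheet
figure, assumed here, and sector errors are treated as independent -
both simplifications):

  import math

  def p_at_least_one_ure(bytes_read, ure_per_bit=1e-14):
      """Probability of hitting at least one URE while reading bytes_read bytes."""
      bits = bytes_read * 8
      return -math.expm1(bits * math.log1p(-ure_per_bit))

  tb = 10**12
  print(p_at_least_one_ure(4 * tb))   # ~0.27: degraded RAID1 pair, read the 4 TB survivor
  print(p_at_least_one_ure(36 * tb))  # ~0.94: 10x4 TB RAID6, read the nine survivors

  # The second number is larger, but a single URE there is corrected
  # from the remaining parity; the first number is the chance of
  # actually losing data on the degraded RAID1/10.

Real drives typically do rather better than the datasheet URE figure, so
treat these as worst-case illustrations rather than predictions.
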