Linux RAID subsystem development
From: Wol's lists <antlists@youngman.org.uk>
To: Phil Turmel <philip@turmel.org>, NeilBrown <neilb@suse.com>,
	mdraid <linux-raid@vger.kernel.org>
Subject: Re: RFC - de-clustered raid 60 or 61 algorithm
Date: Thu, 8 Feb 2018 23:10:14 +0000	[thread overview]
Message-ID: <81626593-c835-315f-3247-3019d81491a0@youngman.org.uk> (raw)
In-Reply-To: <e7c2c26c-c59b-ebab-423d-683a05ddfd8c@turmel.org>

On 08/02/18 12:56, Phil Turmel wrote:
> On 02/07/2018 10:14 PM, NeilBrown wrote:
>> On Thu, Feb 08 2018, Wol's lists wrote:
> 
>>> I've been playing with a mirror setup, and if we have two mirrors, we
>>> can rebuild any failed disk by copying from two other drives. I think
>>> also (I haven't looked at it) that you could do a fast rebuild without
>>> impacting other users of the system too much provided you don't swamp
>>> i/o bandwidth, as half of the requests for data on the three drives
>>> being used for rebuilding could actually be satisfied from other drives.
>>
>> I think that ends up being much the same result as a current raid10
>> where the number of copies doesn't divide the number of devices.
>> Reconstruction reads come from 2 different devices, and half the reads
>> that would go to them now go elsewhere.
> 
> This begs the question:
> 
> Why not just use the raid10,near striping algorithm?  Say one wants
> raid6 n=6 inside raid60 n=25.  Use the raid10,near6 n=25 striping
> algorithm, but within each near6 inner stripe place data and P and Q
> using the existing raid6 rotation.
> 
> What is the more complex placement algorithm providing?
> 
It came from the declustered thread.

Especially with raid-60, a rebuild will hammer a small subset of the 
drives in the array. The idea is that a more complex algorithm will 
spread the load across more drives. If each raid-6 in a raid-60 has, say, 
8 drives, a rebuild will stress 16 drives. If you've got 100 drives in 
total, that's a lot of stress that could be avoided if the data were 
more widely spread.
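To make that concrete, here's a rough sketch (my own illustrative numbers and 
placement rule, not anything in md) comparing how many drives end up serving 
rebuild reads in a plain raid-60 against a de-clustered layout that picks each 
stripe's member drives pseudo-randomly:

```python
# Illustrative only: compare the set of drives that serve rebuild
# reads in a plain raid-60 versus a hypothetical de-clustered layout.
# 100 drives and 8-drive raid-6 groups follow the example above.
import random

NDRIVES = 100          # total drives in the array
GROUP = 8              # drives per raid-6 group (6 data + P + Q)
NSTRIPES = 10_000      # stripes to simulate
FAILED = 0             # index of the failed drive

def raid60_readers():
    # Plain raid-60: drive 0 always belongs to group 0, so every
    # rebuild read hits the same 7 surviving group members.
    group = range(0, GROUP)
    return {d for d in group if d != FAILED}

def declustered_readers(seed=42):
    # De-clustered (sketch): each stripe chooses its 8 drives via a
    # seeded pseudo-random pick, so rebuild reads spread array-wide.
    rng = random.Random(seed)
    readers = set()
    for _ in range(NSTRIPES):
        members = rng.sample(range(NDRIVES), GROUP)
        if FAILED in members:
            readers.update(d for d in members if d != FAILED)
    return readers

print(len(raid60_readers()))       # 7 drives carry the whole rebuild
print(len(declustered_readers()))  # nearly all 99 survivors share it
```

A real layout would of course use a deterministic permutation rather than a 
random pick, so every stripe can be found again, but the load-spreading effect 
is the same.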

Thing is, you CAN gain a lot from an integrated raid-60 that you lose 
with a layered raid-6+0. Another point that came up is that you have to 
scrub a raid-6+0 as a whole bunch of separate arrays.

Really, it's a case of: the more we can spread the data, (1) the lower 
the stress during a rebuild, which reduces the risk of a second related 
failure, and (2) the better the chances of surviving a multiple drive 
failure, because if three logically related drives fail you've lost your 
raid-6-based array. Spreading the data reduces the logical linking 
between drives.
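As a back-of-envelope on point (2), with my own example numbers (96 drives 
arranged as 12 plain raid-6 groups of 8), the chance that three simultaneous 
random failures all land in the same group, killing a plain raid-60, works 
out like this:

```python
# Back-of-envelope: probability that 3 simultaneous random drive
# failures all fall in one raid-6 group of a plain raid-60.
# Example numbers are mine: 96 drives, 12 groups of 8.
from math import comb

ndrives, ngroups, gsize = 96, 12, 8
p_fatal = ngroups * comb(gsize, 3) / comb(ndrives, 3)
print(f"{p_fatal:.4%}")  # roughly half a percent
```

So a plain raid-60 usually survives a triple failure, but it's a gamble on 
where the failures land; the point above is about which drives are logically 
tied together, and about how long the rebuild window stays open.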

Cheers,
Wol

Thread overview: 6+ messages
2018-02-08  0:46 RFC - de-clustered raid 60 or 61 algorithm Wol's lists
2018-02-08  3:14 ` NeilBrown
2018-02-08 12:56   ` Phil Turmel
2018-02-08 23:10     ` Wol's lists [this message]
2018-02-09 23:12   ` Wol's lists
2018-02-10  3:02   ` John Stoffel
