From: Hugo Mills <hugo@carfax.org.uk>
To: Jim Salter <jim@jrs-s.net>
Cc: linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs-RAID(3 or 5/6/etc) like btrfs-RAID1?
Date: Thu, 13 Feb 2014 16:21:40 +0000
Message-ID: <20140213162140.GW6490@carfax.org.uk>
In-Reply-To: <52FCEF46.3070306@jrs-s.net>
On Thu, Feb 13, 2014 at 11:13:58AM -0500, Jim Salter wrote:
> This might be a stupid question but...
>
> Are there any plans to make parity RAID levels in btrfs similar to
> the current implementation of btrfs-raid1?
Yes.
> It took me a while to realize how different and powerful btrfs-raid1
> is from traditional raid1. The ability to string together virtually
> any combination of "mutt" hard drives together in arbitrary ways and
> yet maintain redundancy is POWERFUL, and is seriously going to be a
> killer feature advancing btrfs adoption in small environments.
>
> The one real drawback to btrfs-raid1 is that you're committed to n/2
> storage efficiency, since you're using pure redundancy rather than
> parity on the array. I was thinking about that this morning, and
> suddenly it occurred to me that you ought to be able to create a
> striped parity array in much the same way as a btrfs-raid1 array.
>
> Let's say you have five disks, and you arbitrarily want to define a
> stripe length of four data blocks plus one parity block per
> "stripe". Right now, what you're looking at effectively amounts to
> a RAID3 array, like FreeBSD used to use. But, what if we add two
> more disks? Or three more disks? Or ten more? Is there any reason
> we can't keep our stripe length of four blocks + one parity block,
> and just distribute them relatively ad-hoc in the same way
> btrfs-raid1 distributes redundant data blocks across an ad-hoc array
> of disks?
None whatsoever.
> This could be a pretty powerful setup IMO - if you implemented
> something like this, you'd be able to arbitrarily define your
> storage efficiency (percentage of parity blocks / data blocks) and
> your fault-tolerance level (how many drives you can afford to lose
> before failure) WITHOUT tying it directly to your underlying disks,
> or necessarily needing to rebalance as you add more disks to the
> array. This would be a heck of a lot more flexible than ZFS'
> approach of adding more immutable vdevs.
>
> Please feel free to tell me why I'm dumb for either 1. not realizing
> the obvious flaw in this idea or 2. not realizing it's already being
> worked on in exactly this fashion. =)
The latter. :)
One of the (many) existing problems with the parity RAID
implementation as it stands is that, with large numbers of devices, it
becomes quite inefficient to write data when you (may) need to modify
dozens of devices. It's been Chris's stated intention for a while now
to allow a bound to be placed on the maximum number of devices per
stripe, which allows a degree of control over the space-yield <->
performance knob.
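To make the space-yield <-> performance trade-off concrete, here is a
minimal sketch in Python. It is not btrfs code: the function name
stripe_tradeoff and its parameters are made up for illustration, and it
assumes a stripe spans every device unless a cap is given, with a fixed
number of parity blocks per stripe.

```python
def stripe_tradeoff(num_devices, parity=1, max_stripe_width=None):
    """Return (stripe_width, space_yield) for one parity stripe.

    Hypothetical helper, not a btrfs API. stripe_width is the number of
    devices a full-stripe write touches; space_yield is the fraction of
    the stripe that holds data rather than parity.
    """
    if max_stripe_width is None:
        width = num_devices
    else:
        width = min(num_devices, max_stripe_width)
    if width <= parity:
        raise ValueError("a stripe needs at least one data block")
    return width, (width - parity) / width


if __name__ == "__main__":
    # Compare an unbounded stripe with one capped at 4 data + 1 parity.
    for cap in (None, 5):
        width, space_yield = stripe_tradeoff(24, parity=1, max_stripe_width=cap)
        print(f"cap={cap}: touches {width} devices, yield {space_yield:.1%}")
```

A lower cap touches fewer devices per write but gives up more space to
parity; a higher cap does the opposite.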
It's one of the reasons that the usage tool [1] has a "maximum
stripes" knob on it -- so that you can see the behaviour of the FS
once that feature's in place.
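As a rough illustration of what such a knob changes, the sketch below
simulates chunk allocation across unevenly sized devices. It is NOT the
algorithm used by the usage tool or by btrfs itself. It assumes a simple
greedy policy (each stripe takes one unit from the devices with the most
free space, up to the cap) and that a stripe needs at least parity + 1
devices. The name simulate_usable is hypothetical.

```python
def simulate_usable(sizes, parity=1, max_width=None):
    """Estimate usable space (in allocation units) for a parity profile.

    sizes: free space per device, in whole allocation units.
    parity: parity blocks per stripe (assumed fixed).
    max_width: optional cap on devices per stripe.
    Greedy policy assumed for illustration only.
    """
    free = list(sizes)
    usable = 0
    while True:
        # Devices with space left, fullest-free first.
        candidates = sorted(
            (i for i, f in enumerate(free) if f > 0),
            key=lambda i: free[i],
            reverse=True,
        )
        if max_width is not None:
            candidates = candidates[:max_width]
        if len(candidates) <= parity:
            break  # not enough devices left for data plus parity
        for i in candidates:
            free[i] -= 1
        usable += len(candidates) - parity
    return usable


if __name__ == "__main__":
    disks = [4, 4, 4, 4, 4]
    print("no cap:", simulate_usable(disks))
    print("cap 3: ", simulate_usable(disks, max_width=3))
```

Under these assumptions, capping the stripe width lowers the usable total
on the same set of disks, which is the space cost paid for touching fewer
devices per write.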
Hugo.
[1] http://carfax.org.uk/btrfs-usage/
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Nothing right in my left brain. Nothing left in ---
my right brain.