From: Hugo Mills <hugo-lkml@carfax.org.uk>
To: Andreas Philipp <philipp.andreas@gmail.com>
Cc: Hugo Mills <hugo-lkml@carfax.org.uk>,
	Bart Noordervliet <bart@noordervliet.net>,
	Chris Ball <cjb@laptop.org>,
	linux-btrfs@vger.kernel.org
Subject: Re: Update to Project_ideas wiki page
Date: Wed, 17 Nov 2010 18:34:18 +0000
Message-ID: <20101117183418.GC2401@selene>
In-Reply-To: <4CE41B97.4070606@gmail.com>

On Wed, Nov 17, 2010 at 07:14:47PM +0100, Andreas Philipp wrote:
> On 17.11.2010 18:56, Hugo Mills wrote:
> > On Wed, Nov 17, 2010 at 04:12:29PM +0100, Bart Noordervliet wrote:
> >> Can I suggest we combine this new RAID level management with a
> >> modernisation of the terminology for storage redundancy, as has been
> >> discussed previously in the "Raid1 with 3 drives" thread of March this
> >> year? I.e. abandon the burdened raid* terminology in favour of
> >> something that makes more sense for a filesystem.
> >
> > Well, our current RAID modes are:
> >
> > * 1 Copy ("SINGLE")
> > * 2 Copies ("DUP")
> > * 2 Copies, different spindles ("RAID1")
> > * 1 Copy, 2 Stripes ("RAID0")
> > * 2 Copies, 2 Stripes [each] ("RAID10")
> >
> > The forthcoming RAID5/6 code will expand on that, with
> >
> > * 1 Copy, n Stripes + 1 Parity ("RAID5")
> > * 1 Copy, n Stripes + 2 Parity ("RAID6")
> >
> > (I'm not certain how "n" will be selected -- it could be a config
> > option, or simply selected on the basis of the number of
> > spindles/devices currently in the FS).
> Just one question on "small n": If one has N = 3*k >= 6 spindles, then
> RAID5 with n = N/2-1 results in something like RAID50? So having an
> option for "small n" might realize RAID50 given the right choice for n.

   I see what you're getting at, but actually, that would just be
RAID-5 with small n. It merely happens to spread chunks out over more
spindles than the minimum n+1 required to give you what you asked for.
(See the explanation below for why.)

> > We could further postulate a RAID50/RAID60 mode, which would be
> >
> > * 2 Copies, n Stripes + 1 Parity
> > * 2 Copies, n Stripes + 2 Parity
> Isn't this RAID51/RAID61 (or 15/16 unsure on how to put) and would
> RAID50/RAID60 correspond to

   Errr... yes, you're right. My mistake. Although... again, see the
conclusion below. :)

> * 2 Stripes, n Stripes + 1 Parity
> * 2 Stripes, n Stripes + 2 Parity

   I'm not sure talking about RAID50-like things (as you state above)
makes much sense, given the internal data structures that btrfs uses:

   As far as I know(*), data is firstly allocated in chunks of about
1GiB per device. Chunks are grouped together to give you replication.
So, for a RAID-0 or RAID-1 arrangement, chunks are allocated in pairs,
picked from different devices. For RAID-10, they're allocated in
quartets, again on different devices. For RAID-5, they'd be allocated
in groups of n+1. For RAID-61, we'd use 2n+4 chunks in an allocation.
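
   The group sizes above can be tabulated in a small sketch. This is
not btrfs code: the function name and the table are hypothetical, and
the SINGLE and DUP rows are inferred from the copy counts in the list
of modes rather than stated in this message.

```python
# Hypothetical helper, not btrfs code: number of chunks in one
# allocation group, following the counts given in this message.
def chunks_per_allocation(profile, n=2):
    """Return copies * (data stripes + parity chunks) for a profile.

    n is the number of data stripes for the parity-based modes.
    """
    # (copies, data stripes, parity chunks) per profile
    layout = {
        "SINGLE": (1, 1, 0),  # inferred from "1 Copy"
        "DUP": (2, 1, 0),     # inferred from "2 Copies" (same device)
        "RAID1": (2, 1, 0),
        "RAID0": (1, 2, 0),
        "RAID10": (2, 2, 0),
        "RAID5": (1, n, 1),
        "RAID6": (1, n, 2),
        "RAID51": (2, n, 1),
        "RAID61": (2, n, 2),
    }
    copies, stripes, parity = layout[profile]
    return copies * (stripes + parity)
```

   With n = 3, for example, this gives 4 chunks for RAID-5 and 10 for
RAID-61, matching the n+1 and 2n+4 figures above.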

   For replication strategies where it matters (anything other than
DUP, SINGLE, RAID-1 so far), the chunks are then subdivided into
stripes of a fixed width. Data written to the disk is spread across
the stripes in an appropriate manner.
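
   As a rough illustration of "spread across the stripes", here is a
minimal sketch of plain round-robin striping over a group of chunks.
The 64 KiB stripe width and the function name are assumptions made for
the example; parity and mirroring are not modelled.

```python
STRIPE_LEN = 64 * 1024  # assumed stripe width in bytes, illustrative only

def locate(offset, num_chunks, stripe_len=STRIPE_LEN):
    """Map a logical offset within a chunk group to
    (chunk index, offset inside that chunk) for round-robin striping."""
    stripe_nr = offset // stripe_len      # which stripe, counted overall
    chunk_index = stripe_nr % num_chunks  # round-robin over the chunks
    row = stripe_nr // num_chunks         # full rows before this stripe
    return chunk_index, row * stripe_len + offset % stripe_len
```

   With two chunks, consecutive 64 KiB pieces of data alternate between
chunk 0 and chunk 1, which is the RAID-0 pattern.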

   From this point of view, RAID50 and RAID51 look much the same,
unless the stripe size for the "5" is different to the stripe size for
the "0" or "1". I'm not sure that's the case. If the stripe sizes are
the same, you'll basically get the same layout of data across the 2n+2
chunks -- it's just that (possibly) the internal labels of the chunks
which indicate which bit of data they're holding in the pattern will
be different.
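
   To make the "same chunks, different labels" point concrete, here is
a sketch that labels the 2n+2 chunks of one stripe row for each mode.
The labelling scheme and function name are hypothetical, and parity
rotation is deliberately left out.

```python
def row_labels(mode, n):
    """Label each of the 2n+2 chunks for one stripe row.

    Illustrative only: parity is kept in a fixed position.
    """
    if mode == "RAID51":
        half = [f"D{i}" for i in range(n)] + ["P"]
        return half + half  # second half mirrors the first
    if mode == "RAID50":
        first = [f"D{i}" for i in range(n)] + ["P0"]
        second = [f"D{i}" for i in range(n, 2 * n)] + ["P1"]
        return first + second  # two independent RAID-5 sets, striped
    raise ValueError(mode)
```

   Both modes occupy the same 2n+2 chunks with the same stripe width;
only what each chunk holds in the pattern changes.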

   Hugo.

(*) I could be wrong; hopefully someone will correct me if so.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- A cross? Oy vey, have you picked the wrong vampire! ---