linux-raid.vger.kernel.org archive mirror
From: keld@keldix.com
To: NeilBrown <neilb@suse.de>
Cc: Gionatan Danti <g.danti@assyoma.it>, linux-raid@vger.kernel.org
Subject: Re: RAID 10 far and offset on-disk layouts
Date: Tue, 14 Jan 2014 00:38:34 +0100
Message-ID: <20140113233834.GA8885@www5.open-std.org>
In-Reply-To: <20140114092751.09464b7b@notabene.brown>

On Tue, Jan 14, 2014 at 09:27:51AM +1100, NeilBrown wrote:
> On Mon, 13 Jan 2014 11:15:13 +0100 Gionatan Danti <g.danti@assyoma.it> wrote:
> 
> > On 01/13/2014 10:45 AM, NeilBrown wrote:
> > > On Mon, 13 Jan 2014 09:52:50 +0100 Gionatan Danti <g.danti@assyoma.it> wrote:
> > >
> > >> Hi Neil,
> > >> let me recap from a previous message:
> > >>
> > >>   >FAR LAYOUT
> > >>   >md(4) states:
> > >>   >"The first copy of all data blocks will be striped across the early part
> > >>   >of all drives in RAID0 fashion, and then the next copy of all blocks
> > >>   >will be striped across a later section of all drives, always ensuring
> > >>   >that all copies of any given block are on different drives"
> > >>   >
> > >>   >The "on different drives" part makes me wonder _how_ chunks are
> > >>   >distributed. On a 4-disk array, I can imagine a few different schemas:
> > >>   >
> > >>   >1)	A1 A2 A3 A4
> > >>   >	.. .. .. ..
> > >>   >	A4 A1 A2 A3
> > >>   >
> > >>   >2)	A1 A2 A3 A4
> > >>   >	.. .. .. ..
> > >>   >	A2 A1 A4 A3
> > >>   >
> > >>   >The first schema is the one depicted by SuSe documentation [1], while
> > >>   >the second is the one described by Wikipedia [2].
> > >>   >
> > >>   >Question 1: as the two schemas have different reliability
> > >>   >characteristics, which one is actually used?
> > >>
> > >> SuSe entry:
> > >> https://www.suse.com/documentation/sles11/stor_admin/data/raidmdadmr10cpx.html#b7cynnk
> > >>
> > >> Wikipedia entry:
> > >> http://en.wikipedia.org/wiki/Linux_MD_RAID_10#LINUX-MD-RAID-10 (see how
> > >> far layout is depicted)
> > >>
> > >> Keld kindly told me that the SuSe page is simply out of date, as it depicts
> > >> a situation that changed with newer kernels. So my two questions:
> > >
> > > I cannot see an important difference between the two pages you reference.
> > > Both appear to be correct.
> > 
> > Mmm... they seem different to me.
> > 
> > SuSe FAR Layout:
> > 
> > sda1 sdb1 sdc1 sde1
> >    0    1    2    3
> >    4    5    6    7
> >    . . .
> >    3    0    1    2
> >    7    4    5    6
> > 
> > Notice how (for example) sdb1 is coupled both to sda1 (0,4) and to
> > sdc1 (1,5). If sdb1 fails, a failure of either sda1 or sdc1 leads to data loss.
> > 
> > Now, Wikipedia FAR Layout:
> > 
> > 4 drives (sda1, sdb1, sdc1, sdd1)
> > --------------------
> > A1   A2   A3   A4
> > A5   A6   A7   A8
> > A9   A10  A11  A12
> > ..   ..   ..   ..
> > A2   A1   A4   A3
> > A6   A5   A8   A7
> > A10  A9   A12  A11
> > ..   ..   ..   ..
> > 
> > Notice now how a single disk (eg: sdb1) is coupled to only one other
> > _single_ disk (eg: sda1). In this case, if sdb1 fails, you would also have to
> > lose sda1 to lose data. Losing sdc1 or sdd1 will _not_ lead to data loss.
> > 
> 
> Thanks for being explicit - it is much easier to answer explicit questions :-)
> 
> Yes, they are different.  So the wikipedia article is wrong, or at least
> misleading.  That is not what the "f2" layout looks like.
> 
> The md driver does support that layout.  I don't know yet what mdadm will
> call it, but it won't be called "f2".
> 
> So this change:
> 
> http://en.wikipedia.org/w/index.php?title=Non-standard_RAID_levels&diff=501908270&oldid=501604733
> 
> was wrong.

Well, it was me who made the wikipedia edit. The edit was based on information from Neil that this was
actually the layout. Later we found out that it really was not, but that it should be, and then Neil
implemented the better layout.  Maybe it is not called "f2"; I look forward to being informed what the
actual name will be.

I think the name should be "f2", as it is a "far" layout with 2 copies, and it really should be
the default for "far" with 2 copies, as the redundancy is much better than with the old layout.
Keeping the name would mean that we would not need to write and spread new documentation on this,
and people following existing documentation would automatically get the better implementation.
There is no need for new raid instances of "far" to get the old layout, except for
backwards compatibility.
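
The redundancy difference between the two 4-disk diagrams quoted above can be
checked with a small sketch. This is only an illustration of the diagrams, not
the md driver's code: the function names far_shifted, far_paired and
fatal_pairs are made up here, disks are numbered from 0, and the paired
variant assumes an even number of disks.

```python
from itertools import combinations


def far_shifted(chunk, ndisks):
    """Disks holding both copies of a chunk when the second copy is
    rotated by one disk (the layout in the SuSe diagram)."""
    first = chunk % ndisks
    return {first, (first + 1) % ndisks}


def far_paired(chunk, ndisks):
    """Disks holding both copies of a chunk when disks are swapped in
    pairs (the layout in the wikipedia diagram). Assumes ndisks is even."""
    first = chunk % ndisks
    return {first, first ^ 1}


def fatal_pairs(layout, ndisks):
    """Return every two-disk failure that destroys both copies of some chunk.
    One stripe (ndisks chunks) is enough, as the pattern repeats per stripe."""
    return [
        pair
        for pair in combinations(range(ndisks), 2)
        if any(layout(chunk, ndisks) <= set(pair) for chunk in range(ndisks))
    ]


if __name__ == "__main__":
    for name, layout in (("shifted", far_shifted), ("paired", far_paired)):
        pairs = fatal_pairs(layout, 4)
        print(f"{name}: {len(pairs)} of 6 two-disk failures lose data: {pairs}")
```

Under these assumptions, the shifted layout loses data for 4 of the 6 possible
two-disk failures on a 4-disk array, and the paired layout for only 2 of the 6.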

Best regards
keld


