PATCH: RAID10-layout-descriptions

linux-raid.vger.kernel.org archive mirror

* PATCH: RAID10-layout-descriptions
@ 2014-06-15 17:22 Christoph Anton Mitterer
  2014-06-15 20:43 ` NeilBrown
  2014-07-03  9:40 ` keld
  0 siblings, 2 replies; 5+ messages in thread
From: Christoph Anton Mitterer @ 2014-06-15 17:22 UTC
  To: linux-raid


[-- Attachment #1.1: Type: text/plain, Size: 1328 bytes --]

Hi Neil.

As mentioned on GitHub before, I'm trying to clean up some old patches or
get them merged.


I've lost track a bit of what we discussed about them before.

One thing I remember is that you didn't like Unicode and tbl(1) being
used.

Well, of course we can talk about that again, but I think this is
2014, so literally everyone has Unicode and I think the explanations
benefit from it (actually I see more and more manpages using Unicode).

With respect to tbl(1) and the box drawings: I think you were
complaining that this doesn't work with groff when rendering e.g. to
PDF. I guess you're right, but the question is probably: is
anyone in the world doing this?
I mean, for manpages it seems to work quite well and IMHO improves
readability and understandability of the explanations quite a lot, and
we can't cover every roundabout way in which the nroff files might be
used and for which rendering doesn't work.


After all, I think the patches below contain lots of valuable
information which is currently missing in the manpages, so having that
information merged somehow is surely better than not.
Actually, the same is IMHO fully missing for the different RAID 5/6
layouts.

I'd be happy if someone could look into spelling issues and the like.

Cheers,
Chris.

[-- Attachment #1.2: 0001-revised-the-documentation-of-RAID10-layouts.patch --]
[-- Type: text/x-patch, Size: 12257 bytes --]

From 8c11f7153ff4e5b99ffbe107303afae53de19da2 Mon Sep 17 00:00:00 2001
From: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Date: Wed, 10 Jul 2013 16:03:11 +0200
Subject: [PATCH 1/5] revised the documentation of RAID10 layouts

* Completely revised the documentation of the RAID10 layouts, with examples for
  n2, f2, o2 with an odd and an even number of underlying devices.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
---
 md.4 | 337 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 314 insertions(+), 23 deletions(-)

diff --git a/md.4 b/md.4
index 2574c37..ced1d89 100644
--- a/md.4
+++ b/md.4
@@ -266,32 +266,323 @@ as RAID1+0.  Every datablock is duplicated some number of times, and
 the resulting collection of datablocks are distributed over multiple
 drives.
 
-When configuring a RAID10 array, it is necessary to specify the number
-of replicas of each data block that are required (this will normally
-be 2) and whether the replicas should be 'near', 'offset' or 'far'.
-(Note that the 'offset' layout is only available from 2.6.18).
+When configuring a RAID10 array, it is necessary to specify the number of
+replicas of each data block that are required (this will usually be\ 2) and
+whether their layout should be 'near', 'far' or 'offset' (only available since
+Linux\ 2.6.18).
 
-When 'near' replicas are chosen, the multiple copies of a given chunk
-are laid out consecutively across the stripes of the array, so the two
-copies of a datablock will likely be at the same offset on two
-adjacent devices.
 
+.TP
+.B About the RAID10 Layout Examples
+The examples below visualise the chunk distribution on the underlying devices
+for the respective layout.
+
+For simplicity it is assumed that the size of the chunks equals the size of the
+blocks of the underlying devices as well as those of the RAID10 device exported
+by the kernel (for example \fB/dev/md/\fPname).
+.br
+Therefore the chunks\ /\ chunk numbers map directly to the blocks\ /\ block
+addresses of the exported RAID10 device.
+
+Decimal numbers (0,\ 1, 2,\ …) are the chunks of the RAID10 and due to the above
+assumption also the blocks and block addresses of the exported RAID10 device.
+.br
+Same numbers mean copies of a chunk\ /\ block (obviously on different underlying
+devices).
+.br
+Hexadecimal numbers (0x00,\ 0x01, 0x02,\ …) are the block addresses of the
+underlying devices.
+.PP
+
+
+.TP
+.B 'near' Layout
+When 'near' replicas are chosen, the multiple copies of a given chunk are laid
+out consecutively (“as close to each other as possible”) across the stripes of
+the array.
+
+With an even number of devices, they will likely (unless some misalignment is
+present) lay at the very same offset on the different devices.
+.br
+This is as the “classic” RAID1+0; that is two groups of mirrored devices (in the
+example below the groups Device\ #1\ /\ #2 and Device\ #3\ /\ #4 are each a
+RAID1) both in turn forming a striped RAID0.
+
+.B Example with 2\ copies per chunk and an even number\ (4) of devices:
+.TS
+tab(;);
+  C   -   -   -   -
+  C | C | C | C | C |
+| - | - | - | - | - |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| - | - | - | - | - |
+  C   C   S   C   S
+  C   C   S   C   S
+  C   C   S   S   S
+  C   C   S   S   S.
+;
+;Device #1;Device #2;Device #3;Device #4
+0x00;0;0;1;1
+0x01;2;2;3;3
+⋯;⋯;⋯;⋯;⋯
+⋮;⋮;⋮;⋮;⋮
+⋯;⋯;⋯;⋯;⋯
+0x80;254;254;255;255
+;╰─────────┬─────────╯;╰─────────┬─────────╯
+;RAID1;RAID1
+;╰─────────────────────┬─────────────────────╯
+;RAID0
+.TE
+
+.B Example with 2\ copies per chunk and an odd number\ (5) of devices:
+.TS
+tab(;);
+  C   -   -   -   -   -
+  C | C | C | C | C | C |
+| - | - | - | - | - | - |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| - | - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4;Device #5
+0x00;0;0;1;1;2
+0x01;2;3;3;4;4
+⋯;⋯;⋯;⋯;⋯;⋯
+⋮;⋮;⋮;⋮;⋮;⋮
+⋯;⋯;⋯;⋯;⋯;⋯
+0x80;317;318;318;319;319
+;
+.TE
+.PP
+
+
+.TP
+.B 'far' Layout
 When 'far' replicas are chosen, the multiple copies of a given chunk
-are laid out quite distant from each other.  The first copy of all
-data blocks will be striped across the early part of all drives in
-RAID0 fashion, and then the next copy of all blocks will be striped
-across a later section of all drives, always ensuring that all copies
-of any given block are on different drives.
-
-The 'far' arrangement can give sequential read performance equal to
-that of a RAID0 array, but at the cost of reduced write performance.
-
-When 'offset' replicas are chosen, the multiple copies of a given
-chunk are laid out on consecutive drives and at consecutive offsets.
-Effectively each stripe is duplicated and the copies are offset by one
-device.   This should give similar read characteristics to 'far' if a
-suitably large chunk size is used, but without as much seeking for
-writes.
+are laid out quite distant (“as far as reasonably possible”) from each other.
+
+First a complete sequence of all data blocks (that is all the data one sees on
+the exported RAID10 block device) is striped over the devices. Then a another
+(though “shifted”) complete sequence of all data blocks; and so on (in the case
+of more than 2\ copies per chunk).
+
+The “shift” needed to prevent placing copies of the same chunks on the same
+devices is actually a cyclic permutation with offset\ 1 of each of the stripes
+within a complete sequence of chunks.
+.br
+The offset\ 1 is relative to the previous complete sequence of chunks, so in
+case of more than 2\ copies per chunk one gets the following offsets:
+.br
+1.\ complete sequence of chunks: offset\ ≔\ \ 0
+.br
+2.\ complete sequence of chunks: offset\ ≔\ \ 1
+.br
+3.\ complete sequence of chunks: offset\ ≔\ \ 2
+.br
+                       ⋮
+.br
+n.\ complete sequence of chunks: offset\ ≔\ n−1
+
+.B Example with 2\ copies per chunk and an even number\ (4) of devices:
+.TS
+tab(;);
+  C   -   -   -   -
+  C | C | C | C | C |
+| - | - | - | - | - |
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4
+;
+0x00;0;1;2;3;╮
+0x01;4;5;6;7;├ ▒
+⋯;⋯;⋯;⋯;⋯;┆
+⋮;⋮;⋮;⋮;⋮;┆
+⋯;⋯;⋯;⋯;⋯;┆
+0x40;252;253;254;255;╯
+0x41;3;0;1;2;╮
+0x42;7;4;5;6;├ ▒↻
+⋯;⋯;⋯;⋯;⋯;┆
+⋮;⋮;⋮;⋮;⋮;┆
+⋯;⋯;⋯;⋯;⋯;┆
+0x80;255;252;253;254;╯
+;
+.TE
+
+.B Example with 2\ copies per chunk and an odd number\ (5) of devices:
+.TS
+tab(;);
+  C   -   -   -   -   -
+  C | C | C | C | C | C |
+| - | - | - | - | - | - |
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| - | - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4;Device #5
+;
+0x00;0;1;2;3;4;╮
+0x01;5;6;7;8;9;├ ▒
+⋯;⋯;⋯;⋯;⋯;⋯;┆
+⋮;⋮;⋮;⋮;⋮;⋮;┆
+⋯;⋯;⋯;⋯;⋯;⋯;┆
+0x40;315;316;317;318;319;╯
+0x41;4;0;1;2;3;╮
+0x42;9;5;6;7;8;├ ▒↻
+⋯;⋯;⋯;⋯;⋯;⋯;┆
+⋮;⋮;⋮;⋮;⋮;⋮;┆
+⋯;⋯;⋯;⋯;⋯;⋯;┆
+0x80;319;315;316;317;318;╯
+;
+.TE
+
+With ▒\ being the complete sequence of chunks and ▒↻\ the cyclic permutation
+with offset\ 1 thereof (in the case of more than 2 copies per chunk there would
+be (▒↻)↻,\ ((▒↻)↻)↻,\ …).
+
+The advantage of this layout is that MD can easily spread sequential reads over
+the devices, making them similar to RAID0 in terms of speed.
+.br
+The cost is more seeking for writes, making them substantially slower.
+.PP
+
+
+.TP
+.B 'offset' Layout
+When 'offset' replicas are chosen, all the copies of a given chunk are striped
+conscutively (“offset by the stripe length after each other”) over the devices.
+
+Explained in detail, <number of devices> consecutive chunks are striped over the
+devices, immediately followed by a “shifted” copy of these chunks (and by
+further such “shifted” copies in the case of more than 2\ copies per chunk).
+.br
+This pattern repeates for all further consecutive chunks of the exported RAID10
+device (in other words: all further data blocks).
+
+The “shift” needed to prevent placing copies of the same chunks on the same
+devices is actually a cyclic permutation with offset\ 1 of each of the striped
+copies of <number of devices> consecutive chunks.
+.br
+The offset\ 1 is relative to the previous striped copy of <number of devices>
+consecutive chunks, so in case of more than 2\ copies per chunk one gets the
+following offsets:
+.br
+1.\ <number of devices> consecutive chunks: offset\ ≔\ \ 0
+.br
+2.\ <number of devices> consecutive chunks: offset\ ≔\ \ 1
+.br
+3.\ <number of devices> consecutive chunks: offset\ ≔\ \ 2
+.br
+                             ⋮
+.br
+n.\ <number of devices> consecutive chunks: offset\ ≔\ n−1
+
+.B Example with 2\ copies per chunk and an even number\ (4) of devices:
+.TS
+tab(;);
+  C   -   -   -   -
+  C | C | C | C | C |
+| - | - | - | - | - |
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4
+;
+0x00;0;1;2;3;) AA
+0x01;3;0;1;2;) AA↻
+0x02;4;5;6;7;) AB
+0x03;7;4;5;6;) AB↻
+⋯;⋯;⋯;⋯;⋯;) ⋯
+⋮;⋮;⋮;⋮;⋮;  ⋮
+⋯;⋯;⋯;⋯;⋯;) ⋯
+0x79;251;252;253;254;) EX
+0x80;254;251;252;253;) EX↻
+;
+.TE
+
+.B Example with 2\ copies per chunk and an odd number\ (5) of devices:
+.TS
+tab(;);
+  C   -   -   -   -   -
+  C | C | C | C | C | C |
+| - | - | - | - | - | - |
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| - | - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4;Device #5
+;
+0x00;0;1;2;3;4;) AA
+0x01;4;0;1;2;3;) AA↻
+0x02;5;6;7;8;9;) AB
+0x03;9;5;6;7;8;) AB↻
+⋯;⋯;⋯;⋯;⋯;⋯;) ⋯
+⋮;⋮;⋮;⋮;⋮;⋮;  ⋮
+⋯;⋯;⋯;⋯;⋯;⋯;) ⋯
+0x79;314;315;316;317;318;) EX
+0x80;318;314;315;316;317;) EX↻
+;
+.TE
+
+With AA,\ AB,\ …, AZ,\ BA,\ … being the sets of <number of devices> consecutive
+chunks and AA↻,\ AB↻,\ …, AZ↻,\ BA↻,\ … the cyclic permutations with offset\ 1
+thereof (in the case of more than 2 copies per chunk there would be (AA↻)↻,\ …
+as well as ((AA↻)↻)↻,\ … and so on).
+
+This should give similar read characteristics to 'far' if a suitably large chunk
+size is used, but without as much seeking for writes.
+.PP
+
 
 It should be noted that the number of devices in a RAID10 array need
 not be a multiple of the number of replica of each data block; however,
-- 
2.0.0


[-- Attachment #1.3: 0002-clarified-which-layout-is-available-since-when.patch --]
[-- Type: text/x-patch, Size: 908 bytes --]

From d052e3171dcc71dca15f72fe79076c843c9766ed Mon Sep 17 00:00:00 2001
From: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Date: Tue, 16 Jul 2013 16:52:49 +0200
Subject: [PATCH 2/5] clarified which layout is available since when

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
---
 md.4 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/md.4 b/md.4
index ced1d89..36486c9 100644
--- a/md.4
+++ b/md.4
@@ -268,8 +268,8 @@ drives.
 
 When configuring a RAID10 array, it is necessary to specify the number of
 replicas of each data block that are required (this will usually be\ 2) and
-whether their layout should be 'near', 'far' or 'offset' (only available since
-Linux\ 2.6.18).
+whether their layout should be 'near', 'far' or 'offset' (with 'offset' being
+available since Linux\ 2.6.18).
 
 
 .TP
-- 
2.0.0


[-- Attachment #1.4: 0003-demote-the-the-section-explaining-the-examples.patch --]
[-- Type: text/x-patch, Size: 994 bytes --]

From bcfdd68fc4c286d1b42ec8e306178292f01dd56f Mon Sep 17 00:00:00 2001
From: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Date: Tue, 16 Jul 2013 17:10:24 +0200
Subject: [PATCH 3/5] demote the section explaining the examples

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
---
 md.4 | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/md.4 b/md.4
index 36486c9..76525a5 100644
--- a/md.4
+++ b/md.4
@@ -272,8 +272,8 @@ whether their layout should be 'near', 'far' or 'offset' (with 'offset' being
 available since Linux\ 2.6.18).
 
 
-.TP
-.B About the RAID10 Layout Examples
+.B About the RAID10 Layout Examples:
+.br
 The examples below visualise the chunk distribution on the underlying devices
 for the respective layout.
 
@@ -292,7 +292,6 @@ devices).
 .br
 Hexadecimal numbers (0x00,\ 0x01, 0x02,\ …) are the block addresses of the
 underlying devices.
-.PP
 
 
 .TP
-- 
2.0.0


[-- Attachment #1.5: 0004-process-tbl-code-in-nroff-for-md.4.patch --]
[-- Type: text/x-patch, Size: 733 bytes --]

From ea7e51226003cdd9cd782570f4fd1f4e19df683b Mon Sep 17 00:00:00 2001
From: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Date: Tue, 16 Jul 2013 17:30:07 +0200
Subject: [PATCH 4/5] process tbl code in nroff for md.4

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 167e02d..146c5ff 100644
--- a/Makefile
+++ b/Makefile
@@ -230,7 +230,7 @@ mdmon.man : mdmon.8
 	nroff -man mdmon.8 > mdmon.man
 
 md.man : md.4
-	nroff -man md.4 > md.man
+	nroff -man -t md.4 > md.man
 
 mdadm.conf.man : mdadm.conf.5
 	nroff -man mdadm.conf.5 > mdadm.conf.man
-- 
2.0.0


[-- Attachment #1.6: 0005-fix-some-typos.patch --]
[-- Type: text/x-patch, Size: 1395 bytes --]

From 9c203bca56ba4ac816befc529770129aa6306da3 Mon Sep 17 00:00:00 2001
From: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Date: Tue, 16 Jul 2013 17:34:23 +0200
Subject: [PATCH 5/5] fix some typos

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
---
 md.4 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/md.4 b/md.4
index 76525a5..8f1d3d4 100644
--- a/md.4
+++ b/md.4
@@ -482,13 +482,13 @@ The cost is more seeking for writes, making them substantially slower.
 .TP
 .B 'offset' Layout
 When 'offset' replicas are chosen, all the copies of a given chunk are striped
-conscutively (“offset by the stripe length after each other”) over the devices.
+consecutively (“offset by the stripe length after each other”) over the devices.
 
 Explained in detail, <number of devices> consecutive chunks are striped over the
 devices, immediately followed by a “shifted” copy of these chunks (and by
 further such “shifted” copies in the case of more than 2\ copies per chunk).
 .br
-This pattern repeates for all further consecutive chunks of the exported RAID10
+This pattern repeats for all further consecutive chunks of the exported RAID10
 device (in other words: all further data blocks).
 
 The “shift” needed to prevent placing copies of the same chunks on the same
-- 
2.0.0


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5313 bytes --]
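
The chunk placement described in patch 1/5 can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the
function name raid10_rows and its parameters are made up here, it assumes
there are no more copies than devices, and it follows the tables in the
patch rather than the kernel source.

```python
def raid10_rows(layout, devices, copies, chunks):
    """Return a grid where rows[r][d] is the chunk number stored at
    block r of device d (None where nothing is placed).

    Hypothetical helper: it mirrors the tables of the patch, not md itself.
    """
    if copies > devices:
        raise ValueError("this sketch assumes copies <= devices")

    placements = []  # (row, device, chunk) triples
    for chunk in range(chunks):
        for copy in range(copies):
            if layout == "near":
                # Copies sit next to each other in one flat, striped sequence.
                flat = chunk * copies + copy
                row, dev = divmod(flat, devices)
            elif layout == "far":
                # Each copy is a complete RAID0-style sequence of all chunks;
                # sequence number `copy` is rotated by `copy` devices.
                section = -(-chunks // devices)  # rows per sequence (ceiling)
                row = copy * section + chunk // devices
                dev = (chunk + copy) % devices
            elif layout == "offset":
                # Each stripe of `devices` chunks is directly followed by its
                # rotated copies.
                row = (chunk // devices) * copies + copy
                dev = (chunk + copy) % devices
            else:
                raise ValueError(f"unknown layout: {layout!r}")
            placements.append((row, dev, chunk))

    height = max(row for row, _, _ in placements) + 1
    rows = [[None] * devices for _ in range(height)]
    for row, dev, chunk in placements:
        rows[row][dev] = chunk
    return rows


if __name__ == "__main__":
    for line in raid10_rows("offset", 4, 2, 8):
        print(" ".join(str(chunk) for chunk in line))
```

Run as a script, this prints the rows 0 1 2 3, 3 0 1 2, 4 5 6 7 and
7 4 5 6, which are the first four rows of the 4-device 'offset' example.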


* Re: PATCH: RAID10-layout-descriptions
  2014-06-15 17:22 PATCH: RAID10-layout-descriptions Christoph Anton Mitterer
@ 2014-06-15 20:43 ` NeilBrown
  2014-06-16 16:34   ` Christoph Anton Mitterer
  2014-07-03  9:40 ` keld
  1 sibling, 1 reply; 5+ messages in thread
From: NeilBrown @ 2014-06-15 20:43 UTC
  To: Christoph Anton Mitterer; +Cc: linux-raid

On Sun, 15 Jun 2014 19:22:23 +0200 Christoph Anton Mitterer
<calestyo@scientia.net> wrote:

> Hi Neil.
> 
> As mentioned on GitHub before I'm trying to clean up some old patches or
> get them merged.
> 
> 
> I lost a bit track on what we've discussed about them before,...
> 
> One thing I remember is that you didn't like unicode and tbl(1) being
> used.

tbl I can live with.  Unicode I cannot.
In some contexts Unicode may be ok (non-English words) but not for
line-drawing characters and not for special punctuation.  troff should be
able to create those characters itself and you should tell it what you want.

> 
> Well of course we can talk about that again,... but I think this is
> 2014, so literally everyone has unicode and I think the explanations
> benefit from it (actually I see more and more manpages using unicode).

It's not true that "literally everyone" has enough water to drink.
Suggesting they have Unicode is ridiculous.

> 
> With respect to tbl(1) and the box drawings... I think you were
> complaining that this doesn't work with groff when rendering e.g. to
> PDF.... well I guess you're right, but the question is probably: is
> anyone in the world doing this?

Google for "manpages pdf".

There are certainly web sites that display man pages in HTML, so that is a
minimum requirement.
The correct approach is not "do what I think is cool" but "do the same sort
of thing that all other man pages do to maximize interoperability".

> I mean for manpages it seems to work quite well and IMHO improves
> readability and understandability of the explanations quite a lot... and
> we can't just cover any side way on how the nroff files might be used,
> and for which rendering doesn't work.

The idea of following standards is that if everyone does, then you *do* cover
all conforming uses of the code.

Some of that "standard" is in "man 7 man-pages".  Some of it is simply
de facto.

( of 18508 man pages on my laptop, 333 contain Unicode.
 Some is in people's names. Some is in month names for non-western
 calendars.
 There are occasional "copyright" characters which should be \(co.

 mmcli.8  uses some Unicode quotes, which is wrong.  I'm sure there are
           troff special directives for that
 screen.1 does too
 systemd.time.7 and timedatectl  use an arrow instead of \[->]
 and several use accented characters and non-latin characters which is
 probably OK.
)
> 
> 
> After all,... I think the patches below contain lots of valuable
> information which is currently missing in the manpages... so having that
> information merged somehow is surely better than not.
> Actually the same is IMHO fully missing for the different RAID 5/6
> layouts.

Yes, the patches contain valuable improvements.  Unfortunately some are
blended with regressions which make them hard to use.
Others just fix problems which were introduced earlier in the series, which
is poor form.

When sending patches by email it is greatly preferred to send one patch per
email.  This makes it a lot easier to apply (using "git am"), and a lot
easier to reply to.

I may try to pick the usable bits out of the patches later in the month as I
do like some of the improvements.  But I'd rather not have to :-)

Thanks,
NeilBrown

> 
> I'd be happy if someone could look into spelling issues and that like.
> 
> Cheers,
> Chris.
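
A count like the "333 of 18508" above can be approximated with a short
script. This is a rough sketch under assumptions, not the method used in
the mail: the function names are made up, the man page root is assumed to
be /usr/share/man, and "contains a non-ASCII byte" is used as a stand-in
for "contains Unicode" (it would also flag Latin-1 encoded pages).

```python
import gzip
import zlib
from pathlib import Path


def contains_non_ascii(data: bytes) -> bool:
    """True if any byte lies outside the 7-bit ASCII range."""
    return any(byte > 0x7F for byte in data)


def read_page(path: Path) -> bytes:
    """Read a man page source file, decompressing it if it ends in .gz."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as handle:
        return handle.read()


def count_non_ascii_pages(root):
    """Return (pages examined, pages containing non-ASCII bytes)."""
    total = flagged = 0
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            data = read_page(path)
        except (OSError, EOFError, zlib.error):
            continue  # unreadable or corrupt file: leave it out of the count
        total += 1
        flagged += contains_non_ascii(data)
    return total, flagged
```

Calling count_non_ascii_pages("/usr/share/man") returns the two numbers
for the local system. Pages compressed with anything other than gzip
are read as raw bytes here and would be flagged spuriously.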


* Re: PATCH: RAID10-layout-descriptions
  2014-06-15 20:43 ` NeilBrown
@ 2014-06-16 16:34   ` Christoph Anton Mitterer
  2014-07-03  6:58     ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Anton Mitterer @ 2014-06-16 16:34 UTC
  To: linux-raid


[-- Attachment #1.1: Type: text/plain, Size: 3617 bytes --]

On Mon, 2014-06-16 at 06:43 +1000, NeilBrown wrote: 
> tbl I can live with.  Unicode I cannot.
> In some contexts Unicode may be ok (non-English words) but not for
> line-drawing characters and not for special punctuation.
Well, but you know that tbl(1) won't work for your PDF/HTML rendering
either? At least it didn't when I checked it.

> troff should be
> able to create those characters itself and you should tell it what you want.
Well, I tried that (but I'm not a [n|t|g]roff expert). \[uXXXX]
sequences seem to work fine for man, but PDF again gives me a
warning: can't find special character `uXXXX'

Actually it seems that at least groff should support plain UTF-8
characters and convert them internally to uXXXX representations (and
as one can see, it does for man). Not sure why it doesn't work with
PDF.


> It's not true that "literally everyone" has enough water to drink.
> Suggesting they have Unicode is ridiculous.
Well, not sure what Unicode has to do with water ;-) I've looked
around a bit in Google and found entries about groff/Unicode
dating back to at least 2006. That's eight years. Anyone using very
old or embedded systems which don't support Unicode yet will likely
continue to have a very old version of the mdadm manpages (without
Unicode), and those people who really update to a new mdadm (with my
Unicode patches) but not to a new system supporting Unicode can probably
live with looking at the docs on some newer system.

I mean, I don't quite understand why the majority of users should suffer
(and I think using extended characters clearly improves readability)
just for a very small minority. And after all, that's always a
rather poor argument for not supporting the new (like Unicode here): "we
don't support it, since there are others as well who don't support it
yet". Kind of self-sustaining.


> There are certainly web sites that display man pages in HTML, so that is a
> minimum requirement.
Well, but then again tbl(1) kicks you out of the game. Now of course I
could "hard code / draw" what I do with tbl(1), and that should then
work with PDF/HTML when a fixed-width font is used, but in turn one
loses out in man when the terminal has a different size.

I think adding proper tbl/Unicode support is rather a duty of the groff
guys, and actually there are some mailing list posts where someone
claims to be working on HTML support for tbl.


> The correct approach is not "do what I think is cool" but "do the same sort
> of think that all other man pages do to maximize interoperability".
Well, again, this will keep us in the stone age forever, and I think
using tbl(1) is a much bigger problem then.
Unicode is THE standard charset nowadays. Everything should support
it; if not, it's a bug. Especially when looking at
internationalisation it's simply needed.



Anyway, I guess there's no benefit in discussing Unicode/tbl
here :)


I've attached a new set of patches. The second replaces all the
Unicode stuff with similar ASCII chars.
I'd suggest merging both and not just the result of them, so we have
the fancy Unicode stuff in git as well, should we ever decide to upgrade
to post 1991 ;-)


Hope that helps and you can merge them largely as is. Please tell me
whether or not (or whether other tweaks are needed), so that I can
clean up that branch.

I still have another branch with information you gave me back then about
how reads and writes are done. I'll come up with that in another mail.

Cheers,
Chris.
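
The \[uXXXX] sequences mentioned above can be generated mechanically from
literal characters. The following is a minimal sketch, assuming groff's
documented \[uXXXX] form (code point in uppercase hexadecimal, at least
four digits); the function name to_groff_escapes is made up for this
illustration, and it does not decompose accented letters or pick named
glyphs such as \[->], which may be preferable where they exist.

```python
def to_groff_escapes(text: str) -> str:
    """Replace every non-ASCII character with a groff \\[uXXXX] escape."""
    return "".join(
        ch if ord(ch) < 0x80 else f"\\[u{ord(ch):04X}]"
        for ch in text
    )


if __name__ == "__main__":
    print(to_groff_escapes("a \u2192 b"))  # prints: a \[u2192] b
```

Whether a given output device can then render the glyph is a separate
question, as the PDF warning quoted above shows.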

[-- Attachment #1.2: 0002-replace-all-Unicode-with-similar-ASCII-characters.patch --]
[-- Type: text/x-patch, Size: 10985 bytes --]

From 5a3ffe31ee2829b6f6cf2ff54bf50e8dc5d5b148 Mon Sep 17 00:00:00 2001
From: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Date: Mon, 16 Jun 2014 18:13:39 +0200
Subject: [PATCH 2/2] replace all Unicode with similar ASCII characters
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Well apparently mdadm should stay in the pre 1991 ages where people were limited
to ASCII or some other obscure codepages.

• Replaced all of the Unicode characters introduced in
  commit 2504c5e675b2085110fb76ecc151eba4eeaaa6ab with more or less similar
  ASCII-only replacements.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
---
 md.4 | 142 +++++++++++++++++++++++++++++++++----------------------------------
 1 file changed, 71 insertions(+), 71 deletions(-)

diff --git a/md.4 b/md.4
index c9180a0..af90109 100644
--- a/md.4
+++ b/md.4
@@ -284,26 +284,26 @@ by the kernel (for example \fB/dev/md/\fPname).
 Therefore the chunks\ /\ chunk numbers map directly to the blocks\ /\ block
 addresses of the exported RAID10 device.
 
-Decimal numbers (0,\ 1, 2,\ …) are the chunks of the RAID10 and due to the above
+Decimal numbers (0,\ 1, 2,\ ...) are the chunks of the RAID10 and due to the above
 assumption also the blocks and block addresses of the exported RAID10 device.
 .br
 Same numbers mean copies of a chunk\ /\ block (obviously on different underlying
 devices).
 .br
-Hexadecimal numbers (0x00,\ 0x01, 0x02,\ …) are the block addresses of the
+Hexadecimal numbers (0x00,\ 0x01, 0x02,\ ...) are the block addresses of the
 underlying devices.
 
 
 .TP
 .B 'near' Layout
 When 'near' replicas are chosen, the multiple copies of a given chunk are laid
-out consecutively (“as close to each other as possible”) across the stripes of
+out consecutively ("as close to each other as possible") across the stripes of
 the array.
 
 With an even number of devices, they will likely (unless some misalignment is
 present) lay at the very same offset on the different devices.
 .br
-This is as the “classic” RAID1+0; that is two groups of mirrored devices (in the
+This is as the "classic" RAID1+0; that is two groups of mirrored devices (in the
 example below the groups Device\ #1\ /\ #2 and Device\ #3\ /\ #4 are each a
 RAID1) both in turn forming a striped RAID0.
 
@@ -328,13 +328,13 @@ tab(;);
 ;Device #1;Device #2;Device #3;Device #4
 0x00;0;0;1;1
 0x01;2;2;3;3
-⋯;⋯;⋯;⋯;⋯
-⋮;⋮;⋮;⋮;⋮
-⋯;⋯;⋯;⋯;⋯
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.
+:;:;:;:;:
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.
 0x80;254;254;255;255
-;╰─────────┬─────────╯;╰─────────┬─────────╯
+;\\---------v---------/;\\---------v---------/
 ;RAID1;RAID1
-;╰─────────────────────┬─────────────────────╯
+;\\---------------------v---------------------/
 ;RAID0
 .TE
 
@@ -356,9 +356,9 @@ C.
 ;Device #1;Device #2;Device #3;Device #4;Device #5
 0x00;0;0;1;1;2
 0x01;2;3;3;4;4
-⋯;⋯;⋯;⋯;⋯;⋯
-⋮;⋮;⋮;⋮;⋮;⋮
-⋯;⋯;⋯;⋯;⋯;⋯
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.
+:;:;:;:;:;:
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.
 0x80;317;318;318;319;319
 ;
 .TE
@@ -368,29 +368,29 @@ C.
 .TP
 .B 'far' Layout
 When 'far' replicas are chosen, the multiple copies of a given chunk
-are laid out quite distant (“as far as reasonably possible”) from each other.
+are laid out quite distant ("as far as reasonably possible") from each other.
 
 First a complete sequence of all data blocks (that is all the data one sees on
 the exported RAID10 block device) is striped over the devices. Then a another
-(though “shifted”) complete sequence of all data blocks; and so on (in the case
+(though "shifted") complete sequence of all data blocks; and so on (in the case
 of more than 2\ copies per chunk).
 
-The “shift” needed to prevent placing copies of the same chunks on the same
+The "shift" needed to prevent placing copies of the same chunks on the same
 devices is actually a cyclic permutation with offset\ 1 of each of the stripes
 within a complete sequence of chunks.
 .br
 The offset\ 1 is relative to the previous complete sequence of chunks, so in
 case of more than 2\ copies per chunk one gets the following offsets:
 .br
-1.\ complete sequence of chunks: offset\ ≔\ \ 0
+1.\ complete sequence of chunks: offset\ =\ \ 0
 .br
-2.\ complete sequence of chunks: offset\ ≔\ \ 1
+2.\ complete sequence of chunks: offset\ =\ \ 1
 .br
-3.\ complete sequence of chunks: offset\ ≔\ \ 2
+3.\ complete sequence of chunks: offset\ =\ \ 2
 .br
-                       ⋮
+                       :
 .br
-n.\ complete sequence of chunks: offset\ ≔\ n−1
+n.\ complete sequence of chunks: offset\ =\ n-1
 
 .B Example with 2\ copies per chunk and an even number\ (4) of devices:
 .TS
@@ -415,18 +415,18 @@ C.
 ;
 ;Device #1;Device #2;Device #3;Device #4
 ;
-0x00;0;1;2;3;╮
-0x01;4;5;6;7;├ ▒
-⋯;⋯;⋯;⋯;⋯;┆
-⋮;⋮;⋮;⋮;⋮;┆
-⋯;⋯;⋯;⋯;⋯;┆
-0x40;252;253;254;255;╯
-0x41;3;0;1;2;╮
-0x42;7;4;5;6;├ ▒↻
-⋯;⋯;⋯;⋯;⋯;┆
-⋮;⋮;⋮;⋮;⋮;┆
-⋯;⋯;⋯;⋯;⋯;┆
-0x80;255;252;253;254;╯
+0x00;0;1;2;3;\\ 
+0x01;4;5;6;7;> [#]
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;:
+:;:;:;:;:;:
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;:
+0x40;252;253;254;255;/
+0x41;3;0;1;2;\\ 
+0x42;7;4;5;6;> [#]~
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;:
+:;:;:;:;:;:
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;:
+0x80;255;252;253;254;/
 ;
 .TE
 
@@ -453,24 +453,24 @@ C.
 ;
 ;Device #1;Device #2;Device #3;Device #4;Device #5
 ;
-0x00;0;1;2;3;4;╮
-0x01;5;6;7;8;9;├ ▒
-⋯;⋯;⋯;⋯;⋯;⋯;┆
-⋮;⋮;⋮;⋮;⋮;⋮;┆
-⋯;⋯;⋯;⋯;⋯;⋯;┆
-0x40;315;316;317;318;319;╯
-0x41;4;0;1;2;3;╮
-0x42;9;5;6;7;8;├ ▒↻
-⋯;⋯;⋯;⋯;⋯;⋯;┆
-⋮;⋮;⋮;⋮;⋮;⋮;┆
-⋯;⋯;⋯;⋯;⋯;⋯;┆
-0x80;319;315;316;317;318;╯
+0x00;0;1;2;3;4;\\ 
+0x01;5;6;7;8;9;> [#]
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;:
+:;:;:;:;:;:;:
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;:
+0x40;315;316;317;318;319;/
+0x41;4;0;1;2;3;\\ 
+0x42;9;5;6;7;8;> [#]~
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;:
+:;:;:;:;:;:;:
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;:
+0x80;319;315;316;317;318;/
 ;
 .TE
 
-With ▒\ being the complete sequence of chunks and ▒↻\ the cyclic permutation
+With [#]\ being the complete sequence of chunks and [#]~\ the cyclic permutation
 with offset\ 1 thereof (in the case of more than 2 copies per chunk there would
-be (▒↻)↻,\ ((▒↻)↻)↻,\ …).
+be ([#]~)~,\ (([#]~)~)~,\ ...).
 
 The advantage of this layout is that MD can easily spread sequential reads over
 the devices, making them similar to RAID0 in terms of speed.
@@ -482,16 +482,16 @@ The cost is more seeking for writes, making them substantially slower.
 .TP
 .B 'offset' Layout
 When 'offset' replicas are chosen, all the copies of a given chunk are striped
-consecutively (“offset by the stripe length after each other”) over the devices.
+consecutively ("offset by the stripe length after each other") over the devices.
 
 Explained in detail, <number of devices> consecutive chunks are striped over the
-devices, immediately followed by a “shifted” copy of these chunks (and by
-further such “shifted” copies in the case of more than 2\ copies per chunk).
+devices, immediately followed by a "shifted" copy of these chunks (and by
+further such "shifted" copies in the case of more than 2\ copies per chunk).
 .br
 This pattern repeats for all further consecutive chunks of the exported RAID10
 device (in other words: all further data blocks).
 
-The “shift” needed to prevent placing copies of the same chunks on the same
+The "shift" needed to prevent placing copies of the same chunks on the same
 devices is actually a cyclic permutation with offset\ 1 of each of the striped
 copies of <number of devices> consecutive chunks.
 .br
@@ -499,15 +499,15 @@ The offset\ 1 is relative to the previous striped copy of <number of devices>
 consecutive chunks, so in case of more than 2\ copies per chunk one gets the
 following offsets:
 .br
-1.\ <number of devices> consecutive chunks: offset\ ≔\ \ 0
+1.\ <number of devices> consecutive chunks: offset\ =\ \ 0
 .br
-2.\ <number of devices> consecutive chunks: offset\ ≔\ \ 1
+2.\ <number of devices> consecutive chunks: offset\ =\ \ 1
 .br
-3.\ <number of devices> consecutive chunks: offset\ ≔\ \ 2
+3.\ <number of devices> consecutive chunks: offset\ =\ \ 2
 .br
-                             ⋮
+                             :
 .br
-n.\ <number of devices> consecutive chunks: offset\ ≔\ n−1
+n.\ <number of devices> consecutive chunks: offset\ =\ n-1
 
 .B Example with 2\ copies per chunk and an even number\ (4) of devices:
 .TS
@@ -530,14 +530,14 @@ C.
 ;Device #1;Device #2;Device #3;Device #4
 ;
 0x00;0;1;2;3;) AA
-0x01;3;0;1;2;) AA↻
+0x01;3;0;1;2;) AA~
 0x02;4;5;6;7;) AB
-0x03;7;4;5;6;) AB↻
-⋯;⋯;⋯;⋯;⋯;) ⋯
-⋮;⋮;⋮;⋮;⋮;  ⋮
-⋯;⋯;⋯;⋯;⋯;) ⋯
+0x03;7;4;5;6;) AB~
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;) \.\.\.
+:;:;:;:;:;  :
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;) \.\.\.
 0x79;251;252;253;254;) EX
-0x80;254;251;252;253;) EX↻
+0x80;254;251;252;253;) EX~
 ;
 .TE
 
@@ -562,21 +562,21 @@ C.
 ;Device #1;Device #2;Device #3;Device #4;Device #5
 ;
 0x00;0;1;2;3;4;) AA
-0x01;4;0;1;2;3;) AA↻
+0x01;4;0;1;2;3;) AA~
 0x02;5;6;7;8;9;) AB
-0x03;9;5;6;7;8;) AB↻
-⋯;⋯;⋯;⋯;⋯;⋯;) ⋯
-⋮;⋮;⋮;⋮;⋮;⋮;  ⋮
-⋯;⋯;⋯;⋯;⋯;⋯;) ⋯
+0x03;9;5;6;7;8;) AB~
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;) \.\.\.
+:;:;:;:;:;:;  :
+\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;\.\.\.;) \.\.\.
 0x79;314;315;316;317;318;) EX
-0x80;318;314;315;316;317;) EX↻
+0x80;318;314;315;316;317;) EX~
 ;
 .TE
 
-With AA,\ AB,\ …, AZ,\ BA,\ … being the sets of <number of devices> consecutive
-chunks and AA↻,\ AB↻,\ …, AZ↻,\ BA↻,\ … the cyclic permutations with offset\ 1
-thereof (in the case of more than 2 copies per chunk there would be (AA↻)↻,\ …
-as well as ((AA↻)↻)↻,\ … and so on).
+With AA,\ AB,\ ..., AZ,\ BA,\ ... being the sets of <number of devices> consecutive
+chunks and AA~,\ AB~,\ ..., AZ~,\ BA~,\ ... the cyclic permutations with offset\ 1
+thereof (in the case of more than 2 copies per chunk there would be (AA~)~,\ ...
+as well as ((AA~)~)~,\ ... and so on).
 
 This should give similar read characteristics to 'far' if a suitably large chunk
 size is used, but without as much seeking for writes.
-- 
2.0.0
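
For readers comparing the two patches: the Unicode-to-ASCII substitutions above can be read straight off the paired -/+ lines. The following is only a sketch of that mapping in Python. It is not the script (if any) that was used to produce the patch; the names ASCII_FOR and asciify are made up for illustration, and it covers only the characters visible in the hunks shown here (in running prose the patch writes a plain "..." for the ellipsis, which is what the mapping for "…" reflects).

```python
# Substitutions read off the -/+ line pairs of the ASCII patch.
# Assumption: this table is a reader's reconstruction, not the author's tool.
ASCII_FOR = {
    "⋯": r"\.\.\.",   # horizontal ellipsis inside tbl(1) data
    "⋮": ":",         # vertical ellipsis
    "…": "...",       # ellipsis in running prose
    "↻": "~",         # "cyclic permutation" marker
    "▒": "[#]",       # "complete sequence of chunks" marker
    "≔": "=",
    "−": "-",
    "“": '"',
    "”": '"',
    "╮": r"\\ ",      # top of the brace next to the 'far' tables
    "├": ">",
    "┆": ":",
    "╯": "/",
}

_TABLE = str.maketrans(ASCII_FOR)


def asciify(line: str) -> str:
    """Return `line` with the characters above replaced by their ASCII stand-ins."""
    return line.translate(_TABLE)


if __name__ == "__main__":
    for sample in ("0x01;3;0;1;2;) AA↻", "0x42;7;4;5;6;├ ▒↻", "⋮;⋮;⋮;⋮;⋮;┆"):
        print(asciify(sample))
```

Running it over the removed lines of a hunk should reproduce the added lines for the characters listed; anything else (for example the arcs under the 'near' table) is left untouched.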


[-- Attachment #1.3: 0001-thoroughly-document-the-RAID10-layouts.patch --]
[-- Type: text/x-patch, Size: 12901 bytes --]

From 2504c5e675b2085110fb76ecc151eba4eeaaa6ab Mon Sep 17 00:00:00 2001
From: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Date: Mon, 16 Jun 2014 17:29:51 +0200
Subject: [PATCH 1/2] thoroughly document the RAID10 layouts
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

• Add some detailed documentation for the RAID10 layouts “near”, “far” and
  “offset”.
  Each layout is documented in general and via two examples (one with an odd and
  one with an even number of underlying devices).
• Enable tbl processing for groff in the Makefile, since the documentation uses
  it for drawing tables.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
---
 Makefile |   2 +-
 md.4     | 336 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 314 insertions(+), 24 deletions(-)

diff --git a/Makefile b/Makefile
index 1a4a5dc..03966c9 100644
--- a/Makefile
+++ b/Makefile
@@ -241,7 +241,7 @@ mdmon.man : mdmon.8
 	nroff -man mdmon.8 > mdmon.man
 
 md.man : md.4
-	nroff -man md.4 > md.man
+	nroff -man -t md.4 > md.man
 
 mdadm.conf.man : mdadm.conf.5
 	nroff -man mdadm.conf.5 > mdadm.conf.man
diff --git a/md.4 b/md.4
index 5f6c3a7..c9180a0 100644
--- a/md.4
+++ b/md.4
@@ -266,32 +266,322 @@ as RAID1+0.  Every datablock is duplicated some number of times, and
 the resulting collection of datablocks are distributed over multiple
 drives.
 
-When configuring a RAID10 array, it is necessary to specify the number
-of replicas of each data block that are required (this will normally
-be 2) and whether the replicas should be 'near', 'offset' or 'far'.
-(Note that the 'offset' layout is only available from 2.6.18).
+When configuring a RAID10 array, it is necessary to specify the number of
+replicas of each data block that are required (this will usually be\ 2) and
+whether their layout should be 'near', 'far' or 'offset' (with 'offset' being
+available since Linux\ 2.6.18).
 
-When 'near' replicas are chosen, the multiple copies of a given chunk
-are laid out consecutively across the stripes of the array, so the two
-copies of a datablock will likely be at the same offset on two
-adjacent devices.
 
+.B About the RAID10 Layout Examples:
+.br
+The examples below visualise the chunk distribution on the underlying devices
+for the respective layout.
+
+For simplicity it is assumed that the size of the chunks equals the size of the
+blocks of the underlying devices as well as those of the RAID10 device exported
+by the kernel (for example \fB/dev/md/\fPname).
+.br
+Therefore the chunks\ /\ chunk numbers map directly to the blocks\ /\ block
+addresses of the exported RAID10 device.
+
+Decimal numbers (0,\ 1, 2,\ …) are the chunks of the RAID10 and due to the above
+assumption also the blocks and block addresses of the exported RAID10 device.
+.br
+Same numbers mean copies of a chunk\ /\ block (obviously on different underlying
+devices).
+.br
+Hexadecimal numbers (0x00,\ 0x01, 0x02,\ …) are the block addresses of the
+underlying devices.
+
+
+.TP
+.B 'near' Layout
+When 'near' replicas are chosen, the multiple copies of a given chunk are laid
+out consecutively (“as close to each other as possible”) across the stripes of
+the array.
+
+With an even number of devices, they will likely (unless some misalignment is
+present) lie at the very same offset on the different devices.
+.br
+This is as the “classic” RAID1+0; that is two groups of mirrored devices (in the
+example below the groups Device\ #1\ /\ #2 and Device\ #3\ /\ #4 are each a
+RAID1) both in turn forming a striped RAID0.
+
+.B Example with 2\ copies per chunk and an even number\ (4) of devices:
+.TS
+tab(;);
+  C   -   -   -   -
+  C | C | C | C | C |
+| - | - | - | - | - |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| C | C | C | C | C |
+| - | - | - | - | - |
+  C   C   S   C   S
+  C   C   S   C   S
+  C   C   S   S   S
+  C   C   S   S   S.
+;
+;Device #1;Device #2;Device #3;Device #4
+0x00;0;0;1;1
+0x01;2;2;3;3
+⋯;⋯;⋯;⋯;⋯
+⋮;⋮;⋮;⋮;⋮
+⋯;⋯;⋯;⋯;⋯
+0x80;254;254;255;255
+;╰─────────┬─────────╯;╰─────────┬─────────╯
+;RAID1;RAID1
+;╰─────────────────────┬─────────────────────╯
+;RAID0
+.TE
+
+.B Example with 2\ copies per chunk and an odd number\ (5) of devices:
+.TS
+tab(;);
+  C   -   -   -   -   -
+  C | C | C | C | C | C |
+| - | - | - | - | - | - |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| C | C | C | C | C | C |
+| - | - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4;Device #5
+0x00;0;0;1;1;2
+0x01;2;3;3;4;4
+⋯;⋯;⋯;⋯;⋯;⋯
+⋮;⋮;⋮;⋮;⋮;⋮
+⋯;⋯;⋯;⋯;⋯;⋯
+0x80;317;318;318;319;319
+;
+.TE
+.PP
+
+
+.TP
+.B 'far' Layout
 When 'far' replicas are chosen, the multiple copies of a given chunk
-are laid out quite distant from each other.  The first copy of all
-data blocks will be striped across the early part of all drives in
-RAID0 fashion, and then the next copy of all blocks will be striped
-across a later section of all drives, always ensuring that all copies
-of any given block are on different drives.
-
-The 'far' arrangement can give sequential read performance equal to
-that of a RAID0 array, but at the cost of reduced write performance.
-
-When 'offset' replicas are chosen, the multiple copies of a given
-chunk are laid out on consecutive drives and at consecutive offsets.
-Effectively each stripe is duplicated and the copies are offset by one
-device.   This should give similar read characteristics to 'far' if a
-suitably large chunk size is used, but without as much seeking for
-writes.
+are laid out quite distant (“as far as reasonably possible”) from each other.
+
+First a complete sequence of all data blocks (that is all the data one sees on
+the exported RAID10 block device) is striped over the devices. Then another
+(though “shifted”) complete sequence of all data blocks; and so on (in the case
+of more than 2\ copies per chunk).
+
+The “shift” needed to prevent placing copies of the same chunks on the same
+devices is actually a cyclic permutation with offset\ 1 of each of the stripes
+within a complete sequence of chunks.
+.br
+The offset\ 1 is relative to the previous complete sequence of chunks, so in
+case of more than 2\ copies per chunk one gets the following offsets:
+.br
+1.\ complete sequence of chunks: offset\ ≔\ \ 0
+.br
+2.\ complete sequence of chunks: offset\ ≔\ \ 1
+.br
+3.\ complete sequence of chunks: offset\ ≔\ \ 2
+.br
+                       ⋮
+.br
+n.\ complete sequence of chunks: offset\ ≔\ n−1
+
+.B Example with 2\ copies per chunk and an even number\ (4) of devices:
+.TS
+tab(;);
+  C   -   -   -   -
+  C | C | C | C | C |
+| - | - | - | - | - |
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4
+;
+0x00;0;1;2;3;╮
+0x01;4;5;6;7;├ ▒
+⋯;⋯;⋯;⋯;⋯;┆
+⋮;⋮;⋮;⋮;⋮;┆
+⋯;⋯;⋯;⋯;⋯;┆
+0x40;252;253;254;255;╯
+0x41;3;0;1;2;╮
+0x42;7;4;5;6;├ ▒↻
+⋯;⋯;⋯;⋯;⋯;┆
+⋮;⋮;⋮;⋮;⋮;┆
+⋯;⋯;⋯;⋯;⋯;┆
+0x80;255;252;253;254;╯
+;
+.TE
+
+.B Example with 2\ copies per chunk and an odd number\ (5) of devices:
+.TS
+tab(;);
+  C   -   -   -   -   -
+  C | C | C | C | C | C |
+| - | - | - | - | - | - |
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| - | - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4;Device #5
+;
+0x00;0;1;2;3;4;╮
+0x01;5;6;7;8;9;├ ▒
+⋯;⋯;⋯;⋯;⋯;⋯;┆
+⋮;⋮;⋮;⋮;⋮;⋮;┆
+⋯;⋯;⋯;⋯;⋯;⋯;┆
+0x40;315;316;317;318;319;╯
+0x41;4;0;1;2;3;╮
+0x42;9;5;6;7;8;├ ▒↻
+⋯;⋯;⋯;⋯;⋯;⋯;┆
+⋮;⋮;⋮;⋮;⋮;⋮;┆
+⋯;⋯;⋯;⋯;⋯;⋯;┆
+0x80;319;315;316;317;318;╯
+;
+.TE
+
+With ▒\ being the complete sequence of chunks and ▒↻\ the cyclic permutation
+with offset\ 1 thereof (in the case of more than 2 copies per chunk there would
+be (▒↻)↻,\ ((▒↻)↻)↻,\ …).
+
+The advantage of this layout is that MD can easily spread sequential reads over
+the devices, making them similar to RAID0 in terms of speed.
+.br
+The cost is more seeking for writes, making them substantially slower.
+.PP
+
+
+.TP
+.B 'offset' Layout
+When 'offset' replicas are chosen, all the copies of a given chunk are striped
+consecutively (“offset by the stripe length after each other”) over the devices.
+
+Explained in detail, <number of devices> consecutive chunks are striped over the
+devices, immediately followed by a “shifted” copy of these chunks (and by
+further such “shifted” copies in the case of more than 2\ copies per chunk).
+.br
+This pattern repeats for all further consecutive chunks of the exported RAID10
+device (in other words: all further data blocks).
+
+The “shift” needed to prevent placing copies of the same chunks on the same
+devices is actually a cyclic permutation with offset\ 1 of each of the striped
+copies of <number of devices> consecutive chunks.
+.br
+The offset\ 1 is relative to the previous striped copy of <number of devices>
+consecutive chunks, so in case of more than 2\ copies per chunk one gets the
+following offsets:
+.br
+1.\ <number of devices> consecutive chunks: offset\ ≔\ \ 0
+.br
+2.\ <number of devices> consecutive chunks: offset\ ≔\ \ 1
+.br
+3.\ <number of devices> consecutive chunks: offset\ ≔\ \ 2
+.br
+                             ⋮
+.br
+n.\ <number of devices> consecutive chunks: offset\ ≔\ n−1
+
+.B Example with 2\ copies per chunk and an even number\ (4) of devices:
+.TS
+tab(;);
+  C   -   -   -   -
+  C | C | C | C | C |
+| - | - | - | - | - |
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| C | C | C | C | C | L
+| - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4
+;
+0x00;0;1;2;3;) AA
+0x01;3;0;1;2;) AA↻
+0x02;4;5;6;7;) AB
+0x03;7;4;5;6;) AB↻
+⋯;⋯;⋯;⋯;⋯;) ⋯
+⋮;⋮;⋮;⋮;⋮;  ⋮
+⋯;⋯;⋯;⋯;⋯;) ⋯
+0x79;251;252;253;254;) EX
+0x80;254;251;252;253;) EX↻
+;
+.TE
+
+.B Example with 2\ copies per chunk and an odd number\ (5) of devices:
+.TS
+tab(;);
+  C   -   -   -   -   -
+  C | C | C | C | C | C |
+| - | - | - | - | - | - |
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| C | C | C | C | C | C | L
+| - | - | - | - | - | - |
+C.
+;
+;Device #1;Device #2;Device #3;Device #4;Device #5
+;
+0x00;0;1;2;3;4;) AA
+0x01;4;0;1;2;3;) AA↻
+0x02;5;6;7;8;9;) AB
+0x03;9;5;6;7;8;) AB↻
+⋯;⋯;⋯;⋯;⋯;⋯;) ⋯
+⋮;⋮;⋮;⋮;⋮;⋮;  ⋮
+⋯;⋯;⋯;⋯;⋯;⋯;) ⋯
+0x79;314;315;316;317;318;) EX
+0x80;318;314;315;316;317;) EX↻
+;
+.TE
+
+With AA,\ AB,\ …, AZ,\ BA,\ … being the sets of <number of devices> consecutive
+chunks and AA↻,\ AB↻,\ …, AZ↻,\ BA↻,\ … the cyclic permutations with offset\ 1
+thereof (in the case of more than 2 copies per chunk there would be (AA↻)↻,\ …
+as well as ((AA↻)↻)↻,\ … and so on).
+
+This should give similar read characteristics to 'far' if a suitably large chunk
+size is used, but without as much seeking for writes.
+.PP
+
 
 It should be noted that the number of devices in a RAID10 array need
 not be a multiple of the number of replica of each data block; however,
-- 
2.0.0
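
As a cross-check of the example tables in the patch, the three placement rules it describes can be written down in a few lines. This is a minimal sketch in Python, modelled only on the prose and tables above, not on the kernel's raid10 code; the function names and the rows/stripes parameters are invented for illustration, and it assumes (as the examples do) that one chunk equals one block.

```python
def _rotate(stripe, shift):
    """Cyclic permutation: move every chunk `shift` devices to the right."""
    n = len(stripe)
    return [stripe[(d - shift) % n] for d in range(n)]


def near_layout(devices, copies, rows):
    """'near': the copies of a chunk occupy consecutive slots across the stripes."""
    return [[(r * devices + d) // copies for d in range(devices)]
            for r in range(rows)]


def far_layout(devices, copies, rows_per_copy):
    """'far': one complete sequence of chunks, then a shifted complete sequence, ..."""
    table = []
    for c in range(copies):              # c-th complete sequence, offset = c
        for r in range(rows_per_copy):
            stripe = [r * devices + d for d in range(devices)]
            table.append(_rotate(stripe, c))
    return table


def offset_layout(devices, copies, stripes):
    """'offset': each stripe is immediately followed by its shifted copies."""
    table = []
    for s in range(stripes):
        stripe = [s * devices + d for d in range(devices)]
        for c in range(copies):          # c-th copy of this stripe, offset = c
            table.append(_rotate(stripe, c))
    return table


if __name__ == "__main__":
    for name, rows in (("near", near_layout(5, 2, 2)),
                       ("far", far_layout(4, 2, 2)),
                       ("offset", offset_layout(4, 2, 2))):
        print(name)
        for row in rows:
            print("  ", row)
```

With 2 copies, the first rows produced this way agree with the first rows of the corresponding tables in the patch (for example 0 0 1 1 2 / 2 3 3 4 4 for 'near' on five devices, and 0 1 2 3 / 3 0 1 2 for 'offset' on four).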


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5313 bytes --]


* Re: PATCH: RAID10-layout-descriptions
  2014-06-16 16:34   ` Christoph Anton Mitterer
@ 2014-07-03  6:58     ` NeilBrown
  0 siblings, 0 replies; 5+ messages in thread
From: NeilBrown @ 2014-07-03  6:58 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 2117 bytes --]

On Mon, 16 Jun 2014 18:34:28 +0200 Christoph Anton Mitterer
<calestyo@scientia.net> wrote:

> On Mon, 2014-06-16 at 06:43 +1000, NeilBrown wrote: 
> > tbl I can live with.  Unicode I cannot.
> > In some contexts Unicode may be ok (non-English words) but not for
> > line-drawing characters and not for special punctuation.
> Well but you know, that tbl(1) won't work for your PDF/HTML rendering
> either? At least it didn't when I checked it.

True, it won't work for some renderers.  That is unfortunate but I am willing
to make some sacrifices I guess.

 man -l -Thtml md.4 > md.html

creates some HTML and some png files which look readable in a browser,
including the tables.

> 
> Anyway... I guess there's no benefit in discussing over Unicode/tbl
> here... :)
> 
> 
> I've attached a new set of patches... the second replaces all the
> unicode stuff with similar ASCII chars.
> I'd suggest to merge both and not just the result of them, so we have
> the fancy Unicode stuff in git as well, should we ever decide to upgrade
> to post 1991 ;-)

Thanks for being accommodating of my irrational preferences.

> 
> 
> Hope that helps and you can merge them largely as is,... please tell me
> whether or not (or whether other tweaks are needed)... so that I can
> clean up that branch.

I've merged all your patches together and made a few little modifications of
my own - nothing major.
I changed the Makefile to use "man -l filename" as that seems to be the
"right" thing to do, rather than "nroff -man".  So tbl is handled correctly.

I've added a separate patch with the biggest change I made which was to
replace "Device" with "Dev" in the 5-device arrays so that that table fits in
the width of a (standard, old fashioned, 72 column) page.

> 
> Still have another branch with information you gave me back then, about
> how reads and writes are done... I'll come up with that in another mail.

Should I just pull your description-of-reads-and-writes branch (and impose my
style requirements on it)?

Thanks for your efforts and persistence!!

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]


* Re: PATCH: RAID10-layout-descriptions
  2014-06-15 17:22 PATCH: RAID10-layout-descriptions Christoph Anton Mitterer
  2014-06-15 20:43 ` NeilBrown
@ 2014-07-03  9:40 ` keld
  1 sibling, 0 replies; 5+ messages in thread
From: keld @ 2014-07-03  9:40 UTC (permalink / raw)
  To: Christoph Anton Mitterer; +Cc: linux-raid

Hi all

The updated text is actually technically misleading, or plainly wrong, in its
description of the far and offset layouts. Offset cannot compete with far, as
many benchmarks show; see the wiki:
https://raid.wiki.kernel.org/index.php/Performance

The old text is much more precise than the new one. So I advise that we do not
apply the patch, or that we do another one that keeps the small improvements
the new patches still contain.

Best regards
Keld

On Sun, Jun 15, 2014 at 07:22:23PM +0200, Christoph Anton Mitterer wrote:
> Hi Neil.
> 
> As mentioned on GitHub before I'm trying to clean up some old patches or
> get them merged.
> 
> 
> I lost a bit track on what we've discussed about them before,...
> 
> One thing I remember is that you didn't like unicode and tbl(1) being
> used.
> 
> Well of course we can talk about that again,... but I think this is
> 2014, so literally everyone has unicode and I think the explanations
> benefit from it (actually I see more and more manpages using unicode).
> 
> With respect to tbl(1) and the box drawings... I think you were
> complaining that this doesn't work with groff when rendering e.g. to
> PDF.... well I guess you're right, but the question is probably: is
> anyone in the world doing this?
> I mean for manpages it seems to work quite well and IMHO improves
> readability and understandability of the explanations quite a lot... and
> we can't just cover any side way on how the nroff files might be used,
> and for which rendering doesn't work.
> 
> 
> After all,... I think the patches below contain lots of valuable
> information which is currently missing in the manpages... so having that
> information merged somehow is surely better than not.
> Actually the same is IMHO fully missing for the different RAID 5/6
> layouts.
> 
> I'd be happy if someone could look into spelling issues and that like.
> 
> Cheers,
> Chris.










Thread overview: 5+ messages
2014-06-15 17:22 PATCH: RAID10-layout-descriptions Christoph Anton Mitterer
2014-06-15 20:43 ` NeilBrown
2014-06-16 16:34   ` Christoph Anton Mitterer
2014-07-03  6:58     ` NeilBrown
2014-07-03  9:40 ` keld
