From: Loic Dachary <loic@dachary.org>
To: Koleos Fuskus <koleosfuscus@yahoo.com>
Cc: Ceph Development <ceph-devel@vger.kernel.org>
Subject: Re: Pyramid erasure code description revisited
Date: Mon, 02 Jun 2014 20:49:57 +0200
Message-ID: <538CC755.7000708@dachary.org>
In-Reply-To: <1401733713.18379.YahooMailNeo@web165006.mail.bf1.yahoo.com>
Hi koleosfuscus,
A simpler proposal was made a few days ago. As you rightfully point out, the previous one was a bit complicated to understand ;-)
http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/19753
Cheers
On 02/06/2014 20:28, Koleos Fuskus wrote:
> Hi Loic,
> I am trying to understand your proposal on http://pad.ceph.com/p/cdsgiant-pyramid-erasure-code
> Is the mapping specification a new feature on CRUSH to support Pyramid Codes?
> I don't follow from line 72, when you are talking about "crush multidatacenter mapping".
> Adding a failure domain typically adds a new level in the pyramid using xor?
>
> "** if one chunk is missing, will return that all chunks from the local cluster are needed"
> If one chunk is missing, it recovers it using xor instead of jerasure?
> "** if two chunks are missing in the same local cluster, it will defer to the global level"
> In this case the pyramid code doesn't help, does it?
> ** if two chunks are missing, each of them in a different local cluster, it will return that it needs all chunks from both local cluster but will not defer to the upper level
>
> Best,
> koleos
>
>
>
> On Monday, June 2, 2014 3:14 PM, Loic Dachary <loic@dachary.org> wrote:
> Hi Andreas,
>
> On 02/06/2014 14:20, Andreas Joachim Peters wrote:
>> Hi Loic,
>>
>> I think this gives all the flexibility to define any possible combination for encoding ...
>>
>> When one constructs the steps one has just to be aware that the 'most local' encoding should happen in the end, right?
>
> Yes.
>
>>
>> It would be useful to have a tool which then outputs, for each data and parity chunk, the achieved 'redundancy' as well as the overall volume and the maximal reconstruction 'overhead'.
>
> Right. I'm kind of hoping koleosfuscus (cc'ed) will be able to fit that into the reliability model, but we've not discussed that yet. In any case you are right, a small command line tool would be helpful. Something that would explain: if you lose one of the chunks you need four to recover. If you lose two you need all of them. That's more humanly readable and understandable than the full description ;-)
>
> Cheers
>
>>
>> Cheers Andreas.
>>
>> ________________________________________
>> From: Loic Dachary [loic@dachary.org]
>> Sent: 31 May 2014 19:10
>> To: Andreas Joachim Peters
>> Cc: Ceph Development
>> Subject: Pyramid erasure code description revisited
>>
>> Hi Andreas,
>>
>> After a few weeks and a fresh eye, I revisited the way pyramid erasure code could be described by the system administrator. Here is a proposal that is hopefully more intuitive than the one from the last CDS ( http://pad.ceph.com/p/cdsgiant-pyramid-erasure-code ).
>>
>> These are the steps to create all coding chunks. The upper case letters are data chunks and the lower case letters are coding chunks.
>>
>> "__ABC__DE_" data chunks placement
>>
>> Step 1
>> "__ABC__DE_"
>> "_yVWX_zYZ_" K=5, M=2
>> "_aABC_bDE_"
>>
>> Step 2
>> "_aABC_bDE_"
>> "z_XYZ_____" K=3, M=1
>> "caABC_bDE_"
>>
>> Step 3
>> "caABC_bDE_"
>> "_____zXYZ_" K=3, M=1
>> "caABCdbDE_"
>>
>> Step 4
>> "caABCdbDE_"
>> "_____WXYZz" K=4, M=1
>> "caABCdbDEe"
>>
>> The interpretation of Step 3 is as follows:
>>
>> Given the output of the previous step ( "caABC_bDE_" ), the bDE chunks are considered to be data chunks at this stage and they are marked with XYZ. A K=3, M=1 coding chunk is calculated and placed in the chunk marked with z ( "_____zXYZ_" ). The output of this coding step is the previous step plus the coding chunk that was just calculated, named d ( "caABCdbDE_" ).
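
The four steps above can be replayed mechanically. The following is only a sketch of the notation, not of any Ceph code: the function names (`apply_step`, `encode_layout`) are made up for illustration, and no parity is actually computed. It only tracks which positions are read as data (upper case in the mapping), which receive a coding chunk (lower case), and what the layout looks like after each step.

```python
import string

# Mapping strings from steps 1 to 4 of the proposal.
STEPS = ["_yVWX_zYZ_", "z_XYZ_____", "_____zXYZ_", "_____WXYZz"]


def apply_step(layout, mapping, names):
    """Apply one step and return (new_layout, data_positions, coding_positions).

    Upper-case letters in `mapping` mark chunks used as data in this step,
    lower-case letters mark where the new coding chunks are placed and
    '_' marks positions the step ignores.
    """
    data = [i for i, c in enumerate(mapping) if c.isupper()]
    coding = [i for i, c in enumerate(mapping) if c.islower()]
    # A chunk used as data must already exist in the layout.
    assert all(layout[i] != "_" for i in data)
    # A coding chunk must land in a free slot.
    assert all(layout[i] == "_" for i in coding)
    out = list(layout)
    for i in coding:
        out[i] = next(names)  # name coding chunks a, b, c, ... in order
    return "".join(out), data, coding


def encode_layout(placement, steps):
    """Replay all steps; return a list of (layout, K, M), one per step."""
    names = iter(string.ascii_lowercase)
    layout = placement
    history = []
    for mapping in steps:
        layout, data, coding = apply_step(layout, mapping, names)
        history.append((layout, len(data), len(coding)))
    return history


history = encode_layout("__ABC__DE_", STEPS)
for layout, k, m in history:
    print(f"{layout}  K={k}, M={m}")
```

K and M fall out of the mapping string itself (the count of upper-case and lower-case letters), so in this sketch they do not need to be given separately.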
>>
>> This gives the flexibility of deciding whether or not a coding chunk from a previous step is used as data to compute the coding chunk of the next step. It also allows for unbalanced steps such as step 4.
>>
>> For decoding, the steps are walked from the bottom up. If E is missing, it can be reconstructed from dbD.e in step 4 and the other steps are skipped because it was the only missing chunk. If AB are missing, all steps that have not been used to encode them are ignored, up to step 2, which will fail to recover them because M=1 and yield to step 1, which will use a..CbDE successfully because M=2.
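
The bottom-up walk can be sketched as follows. This is a hypothetical helper (`plan_recovery` is not an existing function) and it assumes that each step can repair up to M missing chunks among the chunks it covers, as a Reed-Solomon style code would. It only plans which step repairs what and which chunks are read; it does not decode anything.

```python
# Mapping strings from steps 1 to 4 and the final layout of the proposal.
STEPS = ["_yVWX_zYZ_", "z_XYZ_____", "_____zXYZ_", "_____WXYZz"]
LAYOUT = "caABCdbDEe"


def plan_recovery(missing, steps=STEPS, layout=LAYOUT):
    """Return a list of (step_number, recovered_chunks, chunks_read).

    `missing` is an iterable of chunk names such as "E" or "AB".
    Steps are visited from the last one to the first one.
    """
    lost_positions = {layout.index(name) for name in missing}
    plan = []
    for number in range(len(steps), 0, -1):
        mapping = steps[number - 1]
        members = [i for i, c in enumerate(mapping) if c != "_"]
        m = sum(c.islower() for c in mapping)
        lost = [i for i in members if i in lost_positions]
        if not lost:
            continue  # this step does not cover any missing chunk
        if len(lost) > m:
            continue  # too many erasures here, defer to an earlier step
        read = "".join(layout[i] for i in members if i not in lost_positions)
        recovered = "".join(layout[i] for i in lost)
        plan.append((number, recovered, read))
        lost_positions -= set(lost)
        if not lost_positions:
            break
    if lost_positions:
        raise ValueError("not recoverable in a single bottom-up pass")
    return plan


print(plan_recovery("E"))   # step 4 repairs E
print(plan_recovery("AB"))  # step 2 defers, step 1 repairs A and B
```

A single pass is a simplification: a chunk repaired by an earlier step could in principle make a later step usable again, which this sketch does not attempt.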
>>
>> Giving up recursion in favor of iteration seems to simplify how it can be explained. And I suspect the implementation is also simpler. What do you think?
>>
>> Cheers
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
--
Loïc Dachary, Artisan Logiciel Libre