From: Loic Dachary <loic@dachary.org>
To: James Plank <plank@cs.utk.edu>
Cc: "ceph-devel@vger.kernel.org" <ceph-devel@vger.kernel.org>
Subject: Re: Comments on Ceph distributed parity implementation
Date: Thu, 20 Jun 2013 20:25:54 +0200
Message-ID: <51C34932.4080304@dachary.org>
In-Reply-To: <3E438C2F-0779-4824-9C05-ABE4B5803E05@cs.utk.edu>
On 06/18/2013 04:22 PM, James Plank wrote:
> Hi all -- thank you for including me on this thread, although I have little substantive to add. At the moment, my sole focus is finishing a journal paper about GF implementations, with a concomitant GF-complete release to accompany it. I agree that the CPU burden of the GF arithmetic will not be a bottleneck in your system, regardless of which implementation you use, as long as you stay at or below GF(2^16). If you want to go higher, GF-complete will help. When we put out a new release (the code will be ready within two weeks, however, the documentation is lagging), I'll let you know. I think LRC is a nice coding paradigm, although I imagine that it has IP issues with Microsoft. I don't have first-hand experience with network/regenerating codes, and I'll be honest -- there have been so many papers in that realm that I am not up to date on them.
>
> Is there a question on which you'd like some help? It sounds as though you are at two decision points: What code should you use, and at which point on the space-overhead/fault-tolerance curve would you like to be?
Hi James,
Unless someone objects, it looks like Ceph is going to use jerasure-1.2 with Reed-Solomon. I'm glad to hear that GF arithmetic will not be a bottleneck: we're going to stay below GF(2^8). However, minimizing the CPU footprint is essential, and I'm looking forward to using the next version, including the SIMD optimizations that you demonstrated in gf-complete.
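To make concrete what the GF(2^8) arithmetic costs per byte, here is a minimal sketch of multiplication and inversion in that field. It is not taken from jerasure or gf-complete: the function names are hypothetical, and the reduction polynomial 0x11d (x^8 + x^4 + x^3 + x^2 + 1) is an assumption, chosen because it is a common choice in Reed-Solomon implementations.

```python
# Sketch of GF(2^8) arithmetic. ASSUMPTION: reduction polynomial 0x11d;
# the function names are illustrative and are not the jerasure API.

def gf256_mul(a, b, poly=0x11D):
    """Multiply two field elements (0..255) with shift-and-XOR."""
    result = 0
    while b:
        if b & 1:          # lowest bit of b set: add (XOR) the current a
            result ^= a
        a <<= 1            # multiply a by x
        if a & 0x100:      # degree reached 8: reduce modulo the polynomial
            a ^= poly
        b >>= 1
    return result


def gf256_inv(a):
    """Multiplicative inverse via a^254, since a^255 == 1 for a != 0."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf256_mul(result, a)
    return result
```

Addition in the field is a plain XOR, so encoding one parity byte is a series of such multiplications XORed together. Optimized libraries replace the bit loop above with lookup tables or SIMD instructions.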
I wrote down a short description of the read/write path I plan to implement in Ceph: https://github.com/dachary/ceph/blob/wip-4929/doc/dev/osd_internals/erasure-code.rst . A quick look at the drawings will hopefully give you an idea. Each OSD is a disk connected to the others over the network. Although I chose K+M = 5, I suspect the most common use case will be around K+M = 7+3 = 10.
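As a rough way to compare points on the space-overhead/fault-tolerance curve, the sketch below computes the raw storage overhead of a K+M layout and, under the binomial model Benoît mentions, the probability of losing more chunks than the code tolerates. The function names are hypothetical, the per-chunk failure probability p = 0.01 is an arbitrary illustration, and failures are assumed independent. Reading [12,2,2] as 12 data chunks plus 2 local and 2 global parities is also an assumption on my part; it matches the 1.33x figure Paul quotes.

```python
# Sketch only: names are illustrative and p is an arbitrary assumption.
from math import comb


def storage_overhead(data_chunks, parity_chunks):
    """Raw bytes stored per byte of user data."""
    return (data_chunks + parity_chunks) / data_chunks


def loss_probability(total_chunks, tolerated, p):
    """P(more than `tolerated` of `total_chunks` fail), each chunk
    failing independently with probability p (binomial model)."""
    return sum(
        comb(total_chunks, i) * p**i * (1 - p) ** (total_chunks - i)
        for i in range(tolerated + 1, total_chunks + 1)
    )


if __name__ == "__main__":
    p = 0.01  # assumed per-chunk failure probability
    # 3x replication: one copy plus two extras, survives two losses.
    print("3x replication:", storage_overhead(1, 2), loss_probability(3, 2, p))
    # Reed-Solomon 7+3: survives any three losses out of ten chunks.
    print("RS 7+3:", storage_overhead(7, 3), loss_probability(10, 3, p))
    # LRC [12,2,2]: overhead only; its failure patterns are not a plain
    # "any M out of K+M" rule, so the binomial formula is not applied.
    print("LRC [12,2,2]:", storage_overhead(12, 2 + 2))
```

The model ignores repair time and correlated failures, so it only gives a first-order comparison between layouts.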
I've seen that jerasure-1.2 provides not only classic Reed-Solomon but also Cauchy Reed-Solomon and liberation / minimal density MDS codes. I assume classic Reed-Solomon is best suited for the default Ceph use case described above, but I'm not sure. What do you think?
Thanks a lot for your advice :-) It helps me write sensible code.
Cheers
>
> Best wishes,
>
> Jim
> ----------
>
> On Jun 18, 2013, at 3:44 AM, Benoît Parrein wrote:
>
>> Hi Paul,
>>
>> thank you for your message
>>
>> from my point of view, LRC focuses on the repair problem: how to reconstruct a destroyed node to maintain the same availability in the distributed system?
>> in this context they can even go below 1x rate by introducing local parity on classical Reed-Solomon blocks (but they pay a supplementary overhead). see Alex Dimakis's excellent papers for that. but, still from my point of view, the same relationship between redundancy and availability holds (if you consider a binomial model for your losses).
>>
>> best
>> bp
>>
>>
>> On 17/06/2013 18:55, Paul Von-Stamwitz wrote:
>>> Loic,
>>>
>>> As Benoit points out, Mojette uses discrete geometry rather than algebra, so simple XOR is all that is needed.
>>>
>>> Benoit,
>>>
>>> Microsoft's paper states that their [12,2,2] LRC provides better availability than 3x replication with 1.33x efficiency. 1.5x is certainly a good number. I'm just pointing out that better efficiency can be had without losing availability.
>>>
>>> All the best,
>>> Paul
>>
>> <benoit_parrein.vcf>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.