From: Alex Elsayed <eternaleye@gmail.com>
To: ceph-devel@vger.kernel.org
Subject: Re: Erasure code library summary
Date: Wed, 19 Jun 2013 00:47:56 -0700
Message-ID: <kprnn8$l72$1@ger.gmane.org>
In-Reply-To: 51C156FA.2000509@dachary.org

Loic Dachary wrote:
>
>
> On 06/19/2013 03:14 AM, Alex Elsayed wrote:
>> Alex Elsayed wrote:
>>
>>> Loic Dachary wrote:
>>>
>>>> Hi Ceph,
>>>>
>>> <snip>
>>>> Reed-Solomon coding family is the only one that can keep the chunks
>>>> unencoded and therefore concatenable.
>>> <snip>
>>>
>>> In my understanding, this is not strictly true - any 'systematic' code
>>> will have the unencoded chunks remain available in this manner, and any
>>> non-systematic linear code can be transformed into a systematic code
>>> with the same minimum distance. Fountain codes are often explicitly
>>> constructed to maintain this property, as in the case of RaptorQ [RFC
>>> 6330].
>>>
>>> https://en.wikipedia.org/wiki/Systematic_code
>>
>> ...that said, Reed-Solomon is, to the best of my knowledge, the only
>> space-optimal such code.
>
> What does "space-optimal" mean ? Does it mean that Reed-Solomon will use
> less disk space than fountain codes to code the same number of parity
> chunks ?

Optimal (for an erasure code) means that if you have K symbols of real data,
then *any* K symbols of the output of the erasure code will let you recover
it.

Current fountain codes (RaptorQ is best-of-breed right now as far as I know)
require K + epsilon symbols, and while epsilon is zero for the vast majority
of cases, some K-sized subsets of the total list of encoded symbols have a
non-zero epsilon, thus requiring more parity data to get exactly the same
level of assurance.

Optimal erasure codes are also known as "Maximum Distance Separable" codes.
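
To make the "any K of N" property concrete, here is a minimal sketch of a
systematic, Reed-Solomon-style code built from polynomial interpolation. The
prime field GF(257), the function names and the tiny parameters (K = 4, N = 7)
are assumptions chosen for readability, not anything from a real library;
production implementations generally work in GF(2^8) so that every symbol
fits in a byte.

```python
from itertools import combinations

P = 257  # small prime field, an illustrative assumption

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, with arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Systematic encoding: symbols 0..K-1 are the data itself,
    symbols K..n-1 are parity (the same polynomial at further points)."""
    points = list(enumerate(data))
    return data + [lagrange_eval(points, x) for x in range(len(data), n)]

def decode(received, k):
    """Recover the K data symbols from any K (index, value) pairs."""
    return [lagrange_eval(received, x) for x in range(k)]

data = [72, 105, 33, 7]        # K = 4 symbols, each smaller than P
codeword = encode(data, 7)     # N = 7: 4 data symbols + 3 parity symbols

# Systematic: the data appears unmodified at the front of the codeword.
is_systematic = codeword[:4] == data

# MDS: every one of the 35 possible 4-symbol subsets recovers the data.
all_subsets_decode = all(
    decode([(i, codeword[i]) for i in subset], 4) == data
    for subset in combinations(range(7), 4)
)
```

Both flags come out true here. A fountain code would pass the same check for
most subsets but not necessarily all of them, which is the epsilon above.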
>> An interesting option, however, might be to use a
>> fountain code over the network when distributing either replicas *or*
>> parity chunks, so that losses can be recovered with <1 full chunk
>> retransmission.
>
> I would be grateful if you could expand on this idea. I don't get it :-)

First, a couple of caveats - one, doing this over TCP would yield no real
benefit. In fact, any reliable transport makes this mostly pointless - the
idea is to avoid retransmitting not only chunks, but packets as well.

Let's assume 4MB chunks. Encode the chunk as a single source block (Raptor
terminology, see the RFC), with a symbol size chosen to fit 1 (one) symbol
comfortably into a single packet of whatever unreliable, unordered transport
you're using. DCCP is basically perfect for this.
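
As a rough check that a 4MB chunk fits in one source block, the arithmetic
can be sketched as below. The 1024-byte symbol size is an assumption picked
to sit comfortably inside a typical 1500-byte MTU, and "4MB" is read as
4 MiB; the limit of 56403 source symbols per block is the one given in
RFC 6330.

```python
import math

CHUNK_SIZE = 4 * 1024 * 1024   # "4MB" chunk, read here as 4 MiB
MAX_SOURCE_SYMBOLS = 56403     # largest source block size in RFC 6330

def source_symbols(chunk_size, symbol_size):
    """Number of source symbols K needed to hold chunk_size bytes."""
    return math.ceil(chunk_size / symbol_size)

symbol_size = 1024             # assumed: one symbol per packet, below the MTU
k = source_symbols(CHUNK_SIZE, symbol_size)   # 4096
fits_in_one_block = k <= MAX_SOURCE_SYMBOLS
```

With these numbers K is 4096, well under the limit, so a single source block
is enough.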

Because RaptorQ is a systematic code, the sender can transmit the unmodified
chunk first. If it gets through okay, the receiver closes the connection and
you're done.

If one or more packets failed to get through, those are erasures - so the
receiver leaves the connection open. The sender can be really simplistic -
'keep encoding and sending symbols as long as the connection is open.' Once
the receiver has enough symbols to recover, it closes the connection.

In cases of no loss, overhead is zero. In cases of some loss, the number of
additional packets is equal to the number of lost packets plus a (very
small) potential overhead. The real benefit here is this:

There is no longer any need to wait for an acknowledgement round trip to
realize a packet was lost.
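
The exchange described above can be simulated with a much simpler stand-in
for RaptorQ: a systematic random linear fountain code over GF(2). This is
not RaptorQ (decoding here is plain Gaussian elimination, and every name is
hypothetical), but it shows the same behaviour: source symbols go out first,
repair symbols follow until the receiver can decode, and losses cost roughly
one extra packet each. Each packet carries its coefficient mask explicitly;
a real protocol would send a symbol ID and derive the mask on both sides.

```python
import random

def make_packet(source, esi, rng):
    """Packet number esi: the source symbol itself while esi < K,
    afterwards the XOR of a random non-empty subset of source symbols."""
    k = len(source)
    mask = 1 << esi if esi < k else rng.randrange(1, 1 << k)
    payload = 0
    for i in range(k):
        if mask >> i & 1:
            payload ^= source[i]
    return mask, payload

class Receiver:
    def __init__(self, k):
        self.k = k
        self.rows = {}  # pivot bit -> (mask, payload), kept in echelon form

    def receive(self, mask, payload):
        # Eliminate known pivots, highest first, then keep what is new.
        for pivot in sorted(self.rows, reverse=True):
            if mask >> pivot & 1:
                m, p = self.rows[pivot]
                mask ^= m
                payload ^= p
        if mask:
            self.rows[mask.bit_length() - 1] = (mask, payload)

    def done(self):
        return len(self.rows) == self.k

    def decode(self):
        out = [0] * self.k
        for pivot in sorted(self.rows):  # back-substitute, lowest pivot first
            mask, payload = self.rows[pivot]
            for i in range(pivot):
                if mask >> i & 1:
                    payload ^= out[i]
            out[pivot] = payload
        return out

def transfer(source, loss_rate, seed):
    """Send until the receiver can decode; return (data, sent, lost)."""
    rng = random.Random(seed)
    rx = Receiver(len(source))
    sent = lost = 0
    while not rx.done():          # the "connection" stays open
        packet = make_packet(source, sent, rng)
        sent += 1
        if rng.random() < loss_rate:
            lost += 1             # erasure: this packet never arrives
            continue
        rx.receive(*packet)
    return rx.decode(), sent, lost

src_rng = random.Random(0)
source = [src_rng.getrandbits(64) for _ in range(32)]   # K = 32 symbols

clean, clean_sent, clean_lost = transfer(source, loss_rate=0.0, seed=1)
lossy, lossy_sent, lossy_lost = transfer(source, loss_rate=0.2, seed=2)
overhead = lossy_sent - lossy_lost - len(source)  # extra symbols beyond K
```

In the lossless run exactly K packets are sent. In the lossy run `overhead`
is the epsilon discussed earlier. The simulation stops the sender the instant
the receiver is done; on a real link the sender would overshoot by whatever
is in flight when the close arrives.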

This is the use case fountain codes are optimized for - coding for
transmission. Creating a new symbol is an O(1) operation for RaptorQ, while
for Reed-Solomon it's O(N) in the size of the source block.

Another neat property of Raptor codes is that you can have multiple,
unsynchronized senders - so for replicas, once one replica has received the
chunk it could join in as a sender and accelerate the remaining transfers
*linearly*, without needing to track who had which symbols in the chunk.

Multicast, too.
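
The point about unsynchronized senders can be illustrated with the same kind
of stand-in (random XOR combinations over GF(2), not RaptorQ itself; all
names are hypothetical). Each sender draws its own random combinations from
an independent seed and nobody tracks what anyone else has sent, yet the
receiver still needs only about K packets in total.

```python
import random

def add_to_basis(basis, mask):
    """Insert a coefficient mask into a GF(2) basis keyed by highest set
    bit. Returns True if the packet carried new information."""
    while mask:
        pivot = mask.bit_length() - 1
        if pivot not in basis:
            basis[pivot] = mask
            return True
        mask ^= basis[pivot]   # clears the pivot bit, continue with the rest
    return False

def packets_needed(k, sender_seeds):
    """Total packets received, round-robin across independent senders,
    before the receiver holds K linearly independent combinations."""
    senders = [random.Random(seed) for seed in sender_seeds]
    basis = {}
    received = 0
    while len(basis) < k:
        for rng in senders:
            if len(basis) == k:
                break
            add_to_basis(basis, rng.randrange(1, 1 << k))
            received += 1
    return received

K = 64
one_sender = packets_needed(K, [1])
three_senders = packets_needed(K, [1, 2, 3])
per_sender = three_senders / 3   # each sender's share of the work
```

The total stays close to K however many senders take part, so each sender's
share shrinks roughly in proportion to the number of senders.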
> Cheers
>