From: Loic Dachary <loic@dachary.org>
To: Noah Watkins <jayhawk@cs.ucsc.edu>
Cc: Ceph Development <ceph-devel@vger.kernel.org>
Subject: Re: Erasure encoding as a storage backend
Date: Sat, 04 May 2013 21:26:45 +0200
Message-ID: <518560F5.3090306@dachary.org>
In-Reply-To: <C38EEFE7-DC29-4325-B268-5BB85B1E4D9A@cs.ucsc.edu>

On 05/04/2013 08:47 PM, Noah Watkins wrote:
> 
> On May 4, 2013, at 11:36 AM, Loic Dachary <loic@dachary.org> wrote:
> 
>>
>>
>> On 05/04/2013 08:27 PM, Noah Watkins wrote:
>>>
>>> On May 4, 2013, at 10:16 AM, Loic Dachary <loic@dachary.org> wrote:
>>>
>>>> it would be great to get feedback before the ceph summit to address the most prominent issues.
>>>
>>> One thing that has been in the back of my mind is how this proposal is influenced (if at all) by a future that includes declustered per-file raid in CephFS. I realize that may be a distant future, but it seems as though there could be a lot of overlap for the (non-client driven) rebuild/recovery component of such an architecture.
>>
>> Hi Noah,
>>
>> I'm not sure what declustered per-file RAID is, which means it had no influence on this proposal ;-) Would you be so kind as to educate me?
> 
> I'm definitely far from an expert on the topic. But briefly the way I think about it is:
> 
> Currently CephFS stripes a file byte stream across a set of objects (e.g. first MB in object 0, 2nd in object 1, etc.), and each of these objects is in turn replicated. Following a failure, PGs re-replicate objects.
> 
> In client-driven RAID the striping algorithm is changed, and clients calculate and distribute parity. In this case parity, rather than replication, provides redundancy. So, one might consider storing objects in a pool with replication size 1. However, the standard PG that does replication wouldn't be able to handle faults correctly (parity rebuild, rather than re-replication), and a smart PG like the ErasureCodedPG would be needed.
> 
> So it seems like the problems are related, but I'm not sure exactly how much overlap there is :)

Are you referring to http://ceph.com/docs/master/architecture/#how-ceph-clients-stripe-data when you talk about client-driven RAID? My understanding is that striping is designed to maximize throughput, and that it is done in the client library (gateway, rbd or cephfs). Since erasure encoding is about recovering from failures and would be implemented in libosd (next to ReplicatedPG), I am under the impression that there is no overlap.
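
To make the distinction concrete, here is a minimal sketch of the two concerns side by side. It is not Ceph code: the function names, the 1 MB object size and the use of a single XOR parity chunk (the simplest possible erasure code) are all assumptions chosen for illustration. The first function is the kind of offset-to-object mapping a client library does when striping; the other two show the parity rebuild that an erasure coded backend would have to do after a failure.

```python
from typing import List

# Illustrative sketch only: names and sizes are assumptions, not Ceph APIs.

def stripe_object_index(offset: int, object_size: int = 1 << 20) -> int:
    """Client-side striping: map a byte offset in a file to the index of
    the object that holds it (first MB in object 0, second in object 1)."""
    return offset // object_size

def xor_parity(chunks: List[bytes]) -> bytes:
    """Compute one XOR parity chunk over equally sized data chunks."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            parity[i] ^= byte
    return bytes(parity)

def rebuild_missing(surviving: List[bytes], parity: bytes) -> bytes:
    """Backend-side recovery: rebuild the single lost chunk by XOR-ing
    the surviving data chunks with the parity chunk."""
    return xor_parity(surviving + [parity])

chunks = [b"abcd", b"efgh", b"ijkl"]
parity = xor_parity(chunks)
# Pretend the middle chunk was lost and rebuild it from the rest.
recovered = rebuild_missing([chunks[0], chunks[2]], parity)
```

In this sketch the striping function only decides where data goes, while the rebuild function only matters once something has been lost, which is why I see them as separate layers.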

What do you think?

> 
> -Noah
> 
> 
>> Cheers
>>
>>> -Noah
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>> -- 
>> Loïc Dachary, Artisan Logiciel Libre
>> All that is necessary for the triumph of evil is that good people do nothing.
>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.




