From: "Stephen Perkins" <perkins@netmass.com>
To: ceph-devel@vger.kernel.org
Subject: Lazy Erasure Coding for RADOS
Date: Mon, 6 Aug 2012 09:50:29 -0500
Message-ID: <00b201cd73e2$d2cf5a60$786e0f20$@netmass.com>

Hi all,

I would like to build a fully geo-redundant and highly available storage
solution.  I read a research paper that describes the architecture of the
Microsoft Azure storage deployment (looking to hit several hundred
petabytes soon).  It was presented at the 23rd ACM Symposium on Operating
Systems Principles.  Information and paper here:

http://blogs.msdn.com/b/windowsazure/archive/2011/11/21/windows-azure-storage-a-highly-available-cloud-storage-service-with-strong-consistency.aspx

The thing I took away from it was that Microsoft considers 3 local copies
to be the minimum required for protection.  However, they also realized
that you cannot afford to scale to an exabyte with a 3x storage overhead.
So they have a lazy background process that converts objects stored with
3x redundancy into objects that are erasure coded with Reed-Solomon (RS),
with a 1.3x or 1.6x overhead.  At the same time, the RS coding provides
better long-term availability than the 3x replication approach.
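
To make the overhead arithmetic concrete, here is a minimal sketch.  The
(k, m) pairs are my own illustrative assumptions, chosen only because they
reproduce the 1.3x and 1.6x figures; they are not the parameters reported
in the Azure paper.  An RS(k, m) code splits an object into k data
fragments plus m parity fragments, so it stores (k + m) / k bytes per user
byte and survives the loss of any m fragments.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per user byte when keeping `copies` full replicas."""
    return Fraction(copies)


def rs_overhead(k: int, m: int) -> Fraction:
    """Raw bytes stored per user byte for RS with k data + m parity fragments."""
    return Fraction(k + m, k)


def raw_capacity_pb(user_pb: int, overhead: Fraction) -> Fraction:
    """Raw capacity (PB) needed to hold `user_pb` petabytes of user data."""
    return user_pb * overhead


if __name__ == "__main__":
    user_pb = 1000  # one exabyte of user data, expressed in petabytes
    # Hypothetical schemes; the (k, m) values are assumptions for illustration.
    schemes = {
        "3x replication": replication_overhead(3),
        "RS(10, 3)": rs_overhead(10, 3),
        "RS(10, 6)": rs_overhead(10, 6),
    }
    for name, overhead in schemes.items():
        raw = raw_capacity_pb(user_pb, overhead)
        print(f"{name}: {float(overhead):.2f}x overhead, {float(raw):.0f} PB raw")
```

Note that 3x replication survives the loss of 2 copies, while the
hypothetical RS(10, 3) survives the loss of any 3 fragments at well under
half the raw capacity.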

Specifics of the RS coding are here (best paper award at USENIX):

https://www.usenix.org/conference/usenixfederatedconferencesweek/erasure-coding-windows-azure-storage

As far as I have found, there are two implementations of RS-coded object
stores out there:
    Commercial - Cleversafe (http://www.cleversafe.com/)
    Open Source - Tahoe-LAFS (http://www.tahoe-lafs.org/)

Given a certain availability metric, stronger erasure coding can make a HUGE
difference in the cost of deployment.  See "Erasure Coding vs Replication: A
Quantitative Comparison" here:
http://oceanstore.cs.berkeley.edu/publications/papers/pdf/erasure_iptps.pdf
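
A rough sketch of that kind of comparison follows.  It assumes each
replica or fragment is lost independently with the same probability p
before repair can happen, which is a simplification of my own and not the
model used in the paper above.  The (k, m) values and p are likewise
hypothetical.  Data is lost under replication only if every copy fails,
and under RS(k, m) only if more than m of the k + m fragments fail.

```python
from math import comb  # requires Python 3.8+


def loss_prob_replication(p: float, copies: int) -> float:
    """Probability that all `copies` replicas are lost (independent failures)."""
    return p ** copies


def loss_prob_erasure(p: float, k: int, m: int) -> float:
    """Probability that more than m of the k + m fragments are lost.

    Assumes independent fragment failures with probability p each; this is
    a binomial tail sum over the number of failed fragments.
    """
    n = k + m
    return sum(
        comb(n, failed) * p ** failed * (1 - p) ** (n - failed)
        for failed in range(m + 1, n + 1)
    )


if __name__ == "__main__":
    p = 0.01  # assumed per-fragment loss probability within a repair window
    print("3x replication:", loss_prob_replication(p, 3))
    print("RS(10, 6)     :", loss_prob_erasure(p, 10, 6))
```

Under these assumptions the hypothetical RS(10, 6) layout has a lower loss
probability than 3x replication while storing 1.6x instead of 3x.
Correlated failures (a rack or a whole site going down) break the
independence assumption, so fragment placement matters in practice.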

Has any thought been given to implementing stronger erasure coding in RADOS
(either directly or in a lazy fashion)?

Thanks in advance for any thoughts,

- Steve

---
Stephen Perkins
NetMass Incorporated
800-731-2737 x5005
+1-972-838-1520 x5005
perkins@netmass.com
 
NetMass™
The safe data company.


