* Lazy Erasure Coding for RADOS
@ 2012-08-06 14:50 Stephen Perkins
2012-08-06 16:19 ` Gregory Farnum
0 siblings, 1 reply; 2+ messages in thread
From: Stephen Perkins @ 2012-08-06 14:50 UTC
To: ceph-devel
Hi all,
I would like to build a fully geo-redundant and highly available storage
solution. I read a research paper that describes the architecture of the
Microsoft Azure deployment (looking to hit several hundred petabytes soon).
This was presented at the 23rd ACM Symposium on Operating Systems Principles.
Information and paper here:
http://blogs.msdn.com/b/windowsazure/archive/2011/11/21/windows-azure-storage-a-highly-available-cloud-storage-service-with-strong-consistency.aspx
The thing I took away from it was that Microsoft considers 3 local copies
to be the minimum required for protection. However, they also realized
that you cannot afford to scale to an exabyte with a 3x storage overhead.
So they have a lazy background process that converts objects stored with
3x redundancy into objects that are erasure coded with Reed-Solomon at a
1.3x or 1.6x overhead. At the same time, the RS coding provides better
long-term availability than the 3x replication approach.
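To make the overhead figures concrete, here is a minimal sketch of the arithmetic. The (k, m) fragment counts below are hypothetical values picked only because they land near the 1.3x and 1.6x figures; they are not taken from the Azure papers, and the function names are my own.

```python
from fractions import Fraction


def erasure_overhead(k: int, m: int) -> Fraction:
    """Raw bytes stored per user byte when k data fragments get m parity fragments."""
    if k <= 0 or m < 0:
        raise ValueError("need k > 0 and m >= 0")
    return Fraction(k + m, k)


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per user byte when every object is kept as whole copies."""
    if copies <= 0:
        raise ValueError("need at least one copy")
    return Fraction(copies, 1)


# Hypothetical layouts: (storage overhead, fragments or copies that can be lost).
# An MDS code such as Reed-Solomon survives the loss of any m fragments;
# n-way replication survives the loss of n - 1 copies.
layouts = {
    "3x replication": (replication_overhead(3), 2),
    "RS k=12, m=4": (erasure_overhead(12, 4), 4),
    "RS k=10, m=6": (erasure_overhead(10, 6), 6),
}

for name, (overhead, losses) in layouts.items():
    print(f"{name}: {float(overhead):.2f}x raw storage, survives {losses} losses")
```

Under these assumed parameters the coded layouts store less than half the raw bytes of 3x replication while tolerating more simultaneous losses, which is the trade the lazy conversion is after.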
Specifics of the RS coding are here (best paper award at USENIX):
https://www.usenix.org/conference/usenixfederatedconferencesweek/erasure-coding-windows-azure-storage
As far as I have found, there are two implementations of R-S coded object
stores out there:
Commercial - Cleversafe (http://www.cleversafe.com/)
Open Source - Tahoe-LAFS (http://www.tahoe-lafs.org/)
Given a certain availability metric, stronger erasure coding can make a HUGE
difference in the cost of deployment. See "Erasure Coding vs Replication: A
Quantitative Comparison" here:
http://oceanstore.cs.berkeley.edu/publications/papers/pdf/erasure_iptps.pdf
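As a rough illustration of why the difference can be so large, the sketch below compares replication and erasure coding at the same 3x overhead. It assumes each fragment is reachable independently with probability p, which is a simplification of mine; the value of p and the (k, n) choices are hypothetical and not taken from the paper.

```python
from math import comb


def availability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n fragments are reachable,
    assuming each fragment is up independently with probability p."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p = 0.9  # hypothetical per-fragment availability

# Both layouts store 3 raw bytes per user byte.
replicated = availability(1, 3, p)  # any 1 of 3 whole copies is enough
coded = availability(4, 12, p)      # any 4 of 12 fragments are enough

print(f"3 replicas:       unavailable with probability {1 - replicated:.2e}")
print(f"4-of-12 fragments: unavailable with probability {1 - coded:.2e}")
```

Replication is just the k = 1 case of the same formula, so the comparison isolates the effect of spreading the same raw storage over more, smaller fragments.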
Has any thought been given to implementing stronger erasure coding in RADOS
(either directly or in a lazy fashion)?
Thanks in advance for any thoughts,
- Steve
---
Stephen Perkins
NetMass Incorporated
800-731-2737 x5005
+1-972-838-1520 x5005
perkins@netmass.com
NetMass™
The safe data company.
* Re: Lazy Erasure Coding for RADOS
2012-08-06 14:50 Lazy Erasure Coding for RADOS Stephen Perkins
@ 2012-08-06 16:19 ` Gregory Farnum
0 siblings, 0 replies; 2+ messages in thread
From: Gregory Farnum @ 2012-08-06 16:19 UTC
To: Stephen Perkins; +Cc: ceph-devel@vger.kernel.org
On Monday, August 6, 2012, Stephen Perkins wrote:
>
> Hi all,
>
> I would like to build a fully geo-redundant and highly available storage
> solution. I read a research paper that describes the architecture of the
> Microsoft Azure deployment (looking to hit several hundred petabytes soon).
> This was presented at the 23rd ACM Symposium on Operating Systems Principles.
> Information and paper here:
>
> http://blogs.msdn.com/b/windowsazure/archive/2011/11/21/windows-azure-storage-a-highly-available-cloud-storage-service-with-strong-consistency.aspx
>
> The thing I took away from it was that Microsoft considers 3 local copies
> to be the minimum required for protection. However, they also realized
> that you cannot afford to scale to an exabyte with a 3x storage overhead.
> So they have a lazy background process that converts objects stored with
> 3x redundancy into objects that are erasure coded with Reed-Solomon at a
> 1.3x or 1.6x overhead. At the same time, the RS coding provides better
> long-term availability than the 3x replication approach.
>
> Specifics of the RS coding are here (best paper award at USENIX):
>
> https://www.usenix.org/conference/usenixfederatedconferencesweek/erasure-coding-windows-azure-storage
>
> As far as I have found, there are two implementations of R-S coded object
> stores out there:
> Commercial - Cleversafe (http://www.cleversafe.com/)
> Open Source - Tahoe-LAFS (http://www.tahoe-lafs.org/)
>
> Given a certain availability metric, stronger erasure coding can make a HUGE
> difference in the cost of deployment. See "Erasure Coding vs Replication: A
> Quantitative Comparison" here:
> http://oceanstore.cs.berkeley.edu/publications/papers/pdf/erasure_iptps.pdf
>
> Has any thought been given to implementing stronger erasure coding in RADOS
> (either directly or in a lazy fashion)?

It's been thought about in the "RADOS should support erasure codes
instead of just replication" sense, but not in the "here is how we
would implement it" sense. I don't know how Azure's storage system works
(will need to check out that paper!), but implementing erasure coding
in the OSDs would essentially require re-implementing or extending all
of their most difficult code, which is obviously not something we're
eager to do at this time.