linux-fsdevel.vger.kernel.org archive mirror
From: Jeff Anderson-Lee <jonah@eecs.berkeley.edu>
To: Jeff Garzik <jeff@garzik.org>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: petabyte class archival filestore wanted/proposed
Date: Thu, 22 Jun 2006 13:29:18 -0700
Message-ID: <449AFD9E.3070308@eecs.berkeley.edu>
In-Reply-To: <449AF53B.10103@garzik.org>

Jeff Garzik wrote:

> Jeff Anderson-Lee wrote:
>
>> I'm part of a project at the University of California, Berkeley that
>> is trying to put together a predominantly archival file system for
>> petabyte-class data stores using Linux with clusters of commodity
>> server hardware.  We currently have multiple terabytes of hardware on
>> top of which we intend to build such a system.  However, our hope is
>> that the end system would be useful for a wide range of users, from
>> someone with three large disks or three disk servers to groups with
>> three or more distributed storage sites.
>>
>> Main Goals/Features:
>>    1) Tapeless: maintain multiple copies on disk (minimize 
>> backup/restore lag)
>>    2) "Mirroring" across remote sites: for disaster recovery (we sit 
>> on top of the Hayward Fault)
>>    3) Persistent snapshots: as archival copies instead of 
>> backup/restore scanning
>>    4) Copy-On-Write: in support of snapshots/archives
>>    5) Append-mostly log structured file system: make synchronization 
>> of remote mirrors easier (tail the log).
>>    6) Avoid (insofar as possible) single point of failure and 
>> bottlenecks (for scalability)
>>
>> I've looked into the existing file systems I know about, and none of 
>> them seem to fit the bill.
>>
>> Parts of the OpenSolaris ZFS file system look interesting, except
>> that (a) it is not on Linux and (b) it seems to mix together too many
>> levels (volume manager and file system).  I can see, however, how
>> using some of its concepts and implementing something like it on top
>> of an append-mostly distributed logical device might work.  Splitting
>> the project into two parts, (a) a robust, distributed logical block
>> device and (b) a flexible file system with snapshots, might make it
>> easier to design and build.
>>
>> Before we begin, however, it is important to find out:
>>    1) Is there anything sufficiently like this to either (a) use 
>> instead, or (b) start from?
>>    2) Is there community support for insertion in the main kernel 
>> tree (without which it is just another toy project)?
>>    3) Does anyone care to join in (a) design, (b) implementation, or 
>> (c) testing?
>
>
> I would recommend checking out Venti:
> http://cm.bell-labs.com/sys/doc/venti.html 

Yes, I've seen that and like some of the ideas.  There is no GPL Linux 
implementation of Venti that I know of.
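
For readers who have not read the paper linked above: the central Venti
idea is a write-once block store in which each block is addressed by the
SHA-1 hash of its contents.  The following is only a minimal in-memory
sketch of that idea in Python, not Venti's actual interface or on-disk
format; the class name ContentAddressedStore and its put/get methods are
hypothetical names chosen for illustration.

```python
import hashlib


class ContentAddressedStore:
    """Toy write-once block store keyed by the SHA-1 of each block.

    Illustration only: a real archival store would persist blocks on
    disk and replicate them; this sketch keeps them in a dict.
    """

    def __init__(self):
        self._blocks = {}  # hex digest -> block contents

    def put(self, data: bytes) -> str:
        """Store a block and return its address (the SHA-1 hex digest)."""
        digest = hashlib.sha1(data).hexdigest()
        # Write-once: a block whose digest is already present is not
        # stored again, so identical blocks are kept only once.
        self._blocks.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch a block by address and verify it against that address."""
        data = self._blocks[digest]
        if hashlib.sha1(data).hexdigest() != digest:
            raise ValueError("stored block does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blocks)


store = ContentAddressedStore()
first = store.put(b"snapshot block")
second = store.put(b"snapshot block")  # same contents, same address
other = store.put(b"a different block")
```

Because an address depends only on the contents, two snapshots that
share unchanged blocks share the same addresses, which is one reason
the approach suits the snapshot and archival goals quoted above.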

Jeff Anderson-Lee


Thread overview: 7+ messages
2006-06-22 16:43 petabyte class archival filestore wanted/proposed Jeff Anderson-Lee
2006-06-22 18:19 ` Bryan Henderson
2006-06-22 18:58   ` Jeff Anderson-Lee
2006-06-23  0:57     ` Bryan Henderson
2006-06-22 19:53 ` Jeff Garzik
2006-06-22 20:29   ` Jeff Anderson-Lee [this message]
2006-06-23  4:26 ` Andreas Dilger
