linux-fsdevel.vger.kernel.org archive mirror
* petabyte class archival filestore wanted/proposed
@ 2006-06-22 16:43 Jeff Anderson-Lee
From: Jeff Anderson-Lee @ 2006-06-22 16:43 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel

I'm part of a project at University of California Berkeley that is 
trying to put together a predominantly archival file system for petabyte 
class data stores using Linux with clusters of commodity server 
hardware.  We currently have multiple terabytes of storage hardware on top 
of which we intend to build such a system.  However, our hope is that the 
end system would be useful for a wide range of users, from someone with 
three large disks or three disk servers to groups with three or more 
distributed storage sites.

Main Goals/Features:
    1) Tapeless: maintain multiple copies on disk (minimize 
backup/restore lag)
    2) "Mirroring" across remote sites: for disaster recovery (we sit on 
top of the Hayward Fault)
    3) Persistent snapshots: as archival copies instead of 
backup/restore scanning
    4) Copy-On-Write: in support of snapshots/archives
    5) Append-mostly log structured file system: make synchronization of 
remote mirrors easier (tail the log).
    6) Avoid (insofar as possible) single point of failure and 
bottlenecks (for scalability)
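
As a rough illustration of goal 5 (an editorial sketch, not a design from 
this project), the following shows why an append-mostly log could make 
mirror synchronization simple: a mirror only has to remember how many 
records it has already applied and then fetch the tail.  The names 
AppendLog, Mirror, tail, and sync are hypothetical, and everything is held 
in memory for brevity.

```python
class AppendLog:
    """Hypothetical append-only log: records are only ever added at the end."""

    def __init__(self):
        self._records = []

    def append(self, record: bytes) -> int:
        """Append a record and return its sequence number."""
        self._records.append(record)
        return len(self._records) - 1

    def tail(self, start: int) -> list:
        """Return every record whose sequence number is >= start."""
        return self._records[start:]

    def __len__(self) -> int:
        return len(self._records)


class Mirror:
    """Hypothetical remote mirror that catches up by tailing the primary's log."""

    def __init__(self):
        self.log = AppendLog()

    def sync(self, primary: AppendLog) -> int:
        """Copy the records this mirror has not seen yet; return how many."""
        missing = primary.tail(len(self.log))
        for record in missing:
            self.log.append(record)
        return len(missing)


primary = AppendLog()
mirror = Mirror()
primary.append(b"write block 0")
primary.append(b"write block 1")
copied_first = mirror.sync(primary)   # mirror was empty, so both records are copied
primary.append(b"write block 2")
copied_second = mirror.sync(primary)  # only the new tail record is copied
```

Because existing records never change, the mirror's position in the log is 
the only state it needs in order to resume after an outage.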
 
I've looked into the existing file systems I know about, and none of 
them seem to fit the bill.

Parts of the OpenSolaris ZFS file system look interesting, except that (a) 
it is not on Linux and (b) it seems to mix together too many levels (volume 
manager and file system).  However, I can see how using some of its 
concepts and implementing something like it on top of an append-mostly 
distributed logical device might work.  Splitting the project into two 
parts, (a) a robust, distributed logical block device and (b) a flexible 
file system with snapshots, might make it easier to design and build.
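
To make the proposed two-part split concrete, here is a minimal sketch 
under stated assumptions: a write-once block layer underneath, and a file 
layer on top that takes copy-on-write snapshots by copying only its 
name-to-block map.  BlockStore and SnapshotFS are hypothetical names, each 
file is a single block, and nothing here is distributed or persistent.

```python
class BlockStore:
    """Hypothetical append-mostly block layer: blocks are written once, never overwritten."""

    def __init__(self):
        self._blocks = []

    def put(self, data: bytes) -> int:
        """Store data in a new block and return its block id."""
        self._blocks.append(data)
        return len(self._blocks) - 1

    def get(self, block_id: int) -> bytes:
        return self._blocks[block_id]


class SnapshotFS:
    """Hypothetical file layer: maps file names to block ids, with snapshots."""

    def __init__(self, store: BlockStore):
        self.store = store
        self.live = {}       # file name -> current block id
        self.snapshots = {}  # snapshot label -> frozen copy of the map

    def write(self, name: str, data: bytes) -> None:
        # Copy-on-write: new data goes to a new block; the old block is untouched.
        self.live[name] = self.store.put(data)

    def read(self, name: str, snapshot=None) -> bytes:
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.store.get(table[name])

    def snapshot(self, label: str) -> None:
        # Only the map is copied; data blocks are shared with the live view.
        self.snapshots[label] = dict(self.live)


fs = SnapshotFS(BlockStore())
fs.write("report.txt", b"v1")
fs.snapshot("2006-06-22")
fs.write("report.txt", b"v2")
old = fs.read("report.txt", snapshot="2006-06-22")  # contents as of the snapshot
new = fs.read("report.txt")                         # current contents
```

The file layer only calls put and get, so in this sketch the block layer 
could be swapped for a replicated one without changing the snapshot logic.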

Before we begin, however, it is important to find out:
    1) Is there anything sufficiently like this to either (a) use 
instead, or (b) start from?
    2) Is there community support for insertion in the main kernel tree 
(without which it is just another toy project)?
    3) Does anyone care to join in (a) design, (b) implementation, or (c) 
testing?

I have been contemplating this for some time and do have some ideas that 
I would be happy to share with any and all interested.

Jeff Anderson-Lee
Petabyte Storage Infrastructure Project
University of California at Berkeley
  



end of thread, other threads:[~2006-06-23  4:26 UTC | newest]

Thread overview: 7+ messages
2006-06-22 16:43 petabyte class archival filestore wanted/proposed Jeff Anderson-Lee
2006-06-22 18:19 ` Bryan Henderson
2006-06-22 18:58   ` Jeff Anderson-Lee
2006-06-23  0:57     ` Bryan Henderson
2006-06-22 19:53 ` Jeff Garzik
2006-06-22 20:29   ` Jeff Anderson-Lee
2006-06-23  4:26 ` Andreas Dilger
