From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jeff Garzik <jeff@garzik.org>
Subject: Re: Continuation Inodes Explained! (was Re: [RFC 0/13]
 extents and 48bit ext3)
Date: Sat, 10 Jun 2006 10:22:08 -0400
Message-ID: <448AD590.40005@garzik.org>
References: <1149816055.4066.60.camel@dyn9047017069.beaverton.ibm.com>
	<4488E1A4.20305@garzik.org>
	<20060609083523.GQ5964@schatzie.adilger.int>
	<44898EE3.6080903@garzik.org> <m3r71ycprd.fsf@bzzz.home.net>
	<20060609153116.GM1651@parisc-linux.org>
	<20060610032623.GG10524@goober>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: Andrew Morton <akpm@osdl.org>, Matthew Wilcox <matthew@wil.cx>,
	Arjan van de Ven <arjan@linux.intel.com>,
	ext2-devel <ext2-devel@lists.sourceforge.net>,
	linux-kernel@vger.kernel.org, Linus Torvalds <torvalds@osdl.org>,
	cmm@us.ibm.com, linux-fsdevel@vger.kernel.org,
	Alex Tomas <alex@clusterfs.com>,
	Andreas Dilger <adilger@clusterfs.com>
Return-path: <ext2-devel-bounces@lists.sourceforge.net>
To: Valerie Henson <val_henson@linux.intel.com>
In-Reply-To: <20060610032623.GG10524@goober>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/ext2-devel>,
	<mailto:ext2-devel-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum=ext2-devel>
List-Post: <mailto:ext2-devel@lists.sourceforge.net>
List-Help: <mailto:ext2-devel-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/ext2-devel>,
	<mailto:ext2-devel-request@lists.sourceforge.net?subject=subscribe>
Sender: ext2-devel-bounces@lists.sourceforge.net
Errors-To: ext2-devel-bounces@lists.sourceforge.net
List-Id: linux-fsdevel.vger.kernel.org

Valerie Henson wrote:
> So what the heck are continuation inodes?  Actually, we named this
> "chunkfs" - not particularly descriptive, maybe continuation inodes is
> a better term.
[...]
> The basic idea is to create a bunch of small file systems - chunks -
> which look like one big file system to the administrator.  Major

Back when I was still playing with my experimental filesystem, one of 
the short-list features I was planning on implementing was the 
allocation of both metadata and data from the same underlying data 
store, essentially collections of "buckets" for data.

The data store would be a succession of progressively-smaller buckets. 
Typical bucket sizes (chosen by admin) on a single filesystem might be: 
1G, 128M, 4M, 1M, 64k, 4k.  The largest (top-most) bucket is the 
fundamental unit of allocation for the filesystem, from which all other 
metadata and data are read/allocated.

So in my example above, the 1G bucket is analogous to a single chunk in 
chunkfs, and any number of 1G buckets -- from any number of block 
devices -- may comprise a single filesystem.
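
To make the size ladder concrete, here is a rough sketch of the 
arithmetic.  Nothing here comes from real code: the names 
(BUCKET_SIZES, fanout, filesystem_capacity) are made up for 
illustration, and it assumes each bucket size divides the next-larger 
one evenly, which holds for the example sizes above.

```python
# Hypothetical sketch: bucket sizes from the example, largest first (bytes).
KIB, MIB, GIB = 1 << 10, 1 << 20, 1 << 30
BUCKET_SIZES = [1 * GIB, 128 * MIB, 4 * MIB, 1 * MIB, 64 * KIB, 4 * KIB]


def fanout(sizes):
    """Return how many buckets of each size fit in the next-larger size."""
    return [big // small for big, small in zip(sizes, sizes[1:])]


def filesystem_capacity(top_buckets_per_device, top_size=BUCKET_SIZES[0]):
    """Total bytes when each block device contributes some top-level buckets.

    Assumption: the filesystem is just the sum of its top-level buckets,
    regardless of which device each one lives on.
    """
    return sum(top_buckets_per_device) * top_size


if __name__ == "__main__":
    print(fanout(BUCKET_SIZES))
    print(filesystem_capacity([3, 5]) // GIB, "GiB across two devices")
```

With these sizes, one 1G bucket splits into 8 128M buckets, each of 
those into 32 4M buckets, and so on down to 4k.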

New inode tables, bitmap chunks, directories, large files, etc. are all 
allocated from an "appropriate" bucket.  IMO this type of solution 
provides fsck-friendly isolation, and adds sufficient flexibility for 
doing things like delayed alloc, metadata-is-a-file, etc.
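
As a minimal sketch of what allocating from an "appropriate" bucket 
could look like: the code below assumes "appropriate" means the 
smallest bucket size that holds the request, and that smaller buckets 
are carved out of the next-larger size on demand.  Both are my 
assumptions for illustration; BucketStore and its methods are 
hypothetical names, not code from any real filesystem.

```python
# Hypothetical sketch of a bucket allocator; not taken from real code.
KIB, MIB, GIB = 1 << 10, 1 << 20, 1 << 30
BUCKET_SIZES = [1 * GIB, 128 * MIB, 4 * MIB, 1 * MIB, 64 * KIB, 4 * KIB]


class BucketStore:
    def __init__(self, top_buckets):
        # free[size] lists byte offsets of free buckets of that size.
        # Initially only top-level (largest) buckets exist.
        self.free = {size: [] for size in BUCKET_SIZES}
        top = BUCKET_SIZES[0]
        self.free[top] = [i * top for i in range(top_buckets)]

    @staticmethod
    def pick_size(nbytes):
        """Assumed policy: smallest bucket size that can hold nbytes."""
        fitting = [s for s in BUCKET_SIZES if s >= nbytes]
        if not fitting:
            raise ValueError("request larger than the top-level bucket")
        return min(fitting)

    def _refill(self, level):
        """Ensure a free bucket exists at this level, splitting a parent."""
        size = BUCKET_SIZES[level]
        if self.free[size]:
            return
        if level == 0:
            raise RuntimeError("no free top-level buckets")
        self._refill(level - 1)
        parent = BUCKET_SIZES[level - 1]
        base = self.free[parent].pop(0)
        self.free[size] = [base + i * size for i in range(parent // size)]

    def alloc(self, nbytes):
        """Return (offset, bucket_size) for a request of nbytes."""
        size = self.pick_size(nbytes)
        self._refill(BUCKET_SIZES.index(size))
        return self.free[size].pop(0), size


if __name__ == "__main__":
    store = BucketStore(top_buckets=2)
    print(store.alloc(3000))        # small metadata-sized request
    print(store.alloc(100 * MIB))   # large file extent
```

Since every allocation, metadata or data, traces back to exactly one 
top-level bucket, a checker could in principle walk one top-level 
bucket at a time, which is the isolation property I was after.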

	Jeff