From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andreas Dilger Subject: Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck Date: Wed, 25 Apr 2007 05:38:34 -0600 Message-ID: <20070425113834.GT5967@schatzie.adilger.int> References: <17965.60841.900376.524639@gargle.gargle.HOWL> <17966.23512.363955.141489@gargle.gargle.HOWL> <462E7C47.8080604@ksu.edu> <20070425105434.GX32602149@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Amit Gud , Nikita Danilov , David Lang , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, val_henson@linux.intel.com, riel@surriel.com, zab@zabbo.net, arjan@infradead.org, suparna@in.ibm.com, brandon@ifup.org, karunasagark@gmail.com To: David Chinner Return-path: Received: from mail.clusterfs.com ([206.168.112.78]:48763 "EHLO mail.clusterfs.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2992723AbXDYLih (ORCPT ); Wed, 25 Apr 2007 07:38:37 -0400 Content-Disposition: inline In-Reply-To: <20070425105434.GX32602149@melbourne.sgi.com> Sender: linux-fsdevel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org On Apr 25, 2007 20:54 +1000, David Chinner wrote: > On Tue, Apr 24, 2007 at 04:53:11PM -0500, Amit Gud wrote: > > Right now, there is no distinction between an inode and continuation > > inode (also referred to as 'cnode' below), except for the > > EXT2_IS_CONT_FL flag. Every inode holds a list of static number of > > inodes, currently limited to 4. > > > > The structure looks like this: > > > > ---------- ---------- > > | cnode 0 |---------->| cnode 0 |----------> to another cnode or NULL > > ---------- ---------- > > | cnode 1 |----- | cnode 1 |----- > > ---------- | ---------- | > > | cnode 2 |-- | | cnode 2 |-- | > > ---------- | | ---------- | | > > | cnode 3 | | | | cnode 3 | | | > > ---------- | | ---------- | | > > | | | | | | > > > > inodes inodes or NULL > > How do you recover if fsfuzzer takes out a cnode in the chain? The > chunk is marked clean, but clearly corrupted and needs fixing and > you don't know what it was pointing at. Hence you have a pointer to > a trashed cnode *somewhere* that you need to find and fix, and a > bunch of orphaned cnodes that nobody points to *somewhere else* in > the filesystem that you have to find. That's a full scan fsck case, > isn't? Presumably, the cnodes in the other chunks contain forward and back references. Those need to contain at minimum inode + generation + chunk to avoid problem of pointing to a _different_ inode after such corruption caused the old inode to be deleted and a new one allocated in its place. If the cnode in each chunk is more than just a singly-linked list, the file as a whole could survive multiple chunk corruptions, though there would now be holes in the file. > It seems that any sort of damage to the underlying storage (e.g. > media error, I/O error or user brain explosion) results in the need > to do a full fsck and hence chunkfs gives you no benefit in this > case. There are several cases where such corruption could be found: - file access from the "parent" cnode will be missing corrupted cnode, probably causing a fsck of both the source and target chunks - a fsck of the source chunk would find the dangling cnode reference and cause a fsck of the corrupt chunk - a fsck of the later cnode chunks would find the dangling cnode reference and cause a fsck of the corrupt chunk - a fsck of the corrupt chunk would find the original corruption The case where only a fsck of the corrupt chunk is done would not find the cnode references. Maybe there needs to be per-chunk info which contains a list/bitmap of other chunks that have cnodes shared with each chunk? Cheers, Andreas -- Andreas Dilger Principal Software Engineer Cluster File Systems, Inc.