From: Nikita Danilov <nikita@clusterfs.com>
To: David Lang <david.lang@digitalinsight.com>
Cc: Amit Gud <gud@cis.ksu.edu>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
val_henson@linux.intel.com, riel@surriel.com, zab@zabbo.net,
arjan@infradead.org, suparna@in.ibm.com, brandon@ifup.org,
karunasagark@gmail.com, gud@ksu.edu
Subject: Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck
Date: Tue, 24 Apr 2007 23:34:48 +0400
Message-ID: <17966.23512.363955.141489@gargle.gargle.HOWL>
In-Reply-To: <Pine.LNX.4.63.0704241123200.7701@qynat.qvtvafvgr.pbz>
David Lang writes:
> On Tue, 24 Apr 2007, Nikita Danilov wrote:
>
> > Amit Gud writes:
> >
> > Hello,
> >
> > >
> > > This is an initial implementation of ChunkFS technique, briefly discussed
> > > at: http://lwn.net/Articles/190222 and
> > > http://cis.ksu.edu/~gud/docs/chunkfs-hotdep-val-arjan-gud-zach.pdf
> >
> > I have a couple of questions about chunkfs repair process.
> >
> > First, as I understand it, each continuation inode is a sparse file,
> > mapping some subset of logical file blocks into block numbers. Then it
> > seems, that during "final phase" fsck has to check that these partial
> > mappings are consistent, for example, that no two different continuation
> > inodes for a given file contain a block number for the same offset. This
> > check requires scan of all chunks (rather than of only "active during
> > crash"), which seems to return us back to the scalability problem
> > chunkfs tries to address.
>
> not quite.
>
> this checking is an O(n^2) or worse problem, and it can eat a lot of memory in
> the process. with chunkfs you divide the problem by a large constant (100 or
> more) for the checks of individual chunks. after those are done then the final
> pass checking the cross-chunk links doesn't have to keep track of everything, it
> only needs to check those links and what they point to
Maybe I failed to describe the problem precisely.
Suppose that all chunks have been checked. After that, for every inode
I0 having continuations I1, I2, ... In, one has to check that every
logical block is present in at most one of these inodes. For this one
has to read I0, with all its indirect (double-indirect, triple-indirect)
blocks, then read I1 with all its indirect blocks, etc., and repeat
this for every inode with continuations.
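
To make the check concrete, here is a minimal sketch in Python. It assumes each inode's mapping has already been read from disk into a dict from logical block offset to physical block number; the function name and the in-memory representation are hypothetical and are not taken from the chunkfs patch.

```python
from collections import defaultdict


def find_overlaps(inodes):
    """Return {logical_offset: [inode names]} for every logical offset
    that is mapped by more than one inode of the same file.

    `inodes` maps an inode name (e.g. "I0", "I1") to that inode's
    mapping {logical_offset: physical_block}. Both the names and this
    in-memory form are illustrative assumptions.
    """
    owners = defaultdict(list)
    for name, mapping in inodes.items():
        # Walking `mapping` stands in for reading the inode and all of
        # its indirect blocks from disk.
        for offset in mapping:
            owners[offset].append(name)
    return {off: names for off, names in owners.items() if len(names) > 1}


# One file with head inode I0 and two continuations.
file_a = {
    "I0": {0: 100, 1: 101},
    "I1": {2: 500},
    "I2": {1: 900, 3: 901},  # offset 1 is also mapped by I0
}
print(find_overlaps(file_a))  # {1: ['I0', 'I2']}
```

Note that the loop has to visit every mapping of every inode of the file, regardless of which chunk was active at the time of the crash.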
In the worst case (every inode has a continuation in every chunk) this
is obviously as bad as un-chunked fsck. But even in the average case, the
total amount of I/O necessary for this operation is proportional to the
_total_ file system size, rather than to the chunk size.
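
A rough way to see the I/O cost is to count the metadata blocks the final pass must read. The sketch below assumes an ext2-style layout (12 direct pointers, then single-, double- and triple-indirect blocks); the layout parameters, function names and input representation are assumptions for illustration, not details from the chunkfs patch.

```python
def mapping_blocks_read(offsets, ptrs_per_block=1024, direct=12):
    """Count metadata blocks (the inode plus every indirect block) that
    must be read to enumerate one inode's mapped logical offsets,
    assuming an ext2-style direct/indirect layout."""
    p = ptrs_per_block
    touched = {("inode",)}
    for off in offsets:
        if off < direct:
            continue  # direct pointers live in the inode itself
        off -= direct
        if off < p:
            touched.add(("ind",))
            continue
        off -= p
        if off < p * p:
            touched.add(("dind",))             # double-indirect root
            touched.add(("dind-leaf", off // p))
            continue
        off -= p * p
        if off >= p ** 3:
            raise ValueError("offset beyond triple-indirect range")
        touched.add(("tind",))                 # triple-indirect root
        touched.add(("tind-mid", off // (p * p)))
        touched.add(("tind-leaf", off // p))
    return len(touched)


def final_pass_reads(files, ptrs_per_block=1024):
    """Sum the metadata reads over every file that has continuations.
    `files` is a list of files; each file is a list of per-inode sets
    of mapped logical offsets (head inode first)."""
    total = 0
    for inode_offsets in files:
        if len(inode_offsets) < 2:
            continue  # no continuations, nothing to cross-check
        total += sum(mapping_blocks_read(o, ptrs_per_block)
                     for o in inode_offsets)
    return total


# Tiny layout (4 pointers per block) to keep the numbers readable.
files = [
    [{0, 1}, {12}, {16}],  # head inode plus two continuations
    [{0}],                 # single inode, skipped by the final pass
]
print(final_pass_reads(files, ptrs_per_block=4))  # 6
```

The sum runs over all files with continuations in the whole file system, so under these assumptions it grows with the total amount of mapped data rather than with the size of any one chunk.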
>
> any ability to mark a filesystem as 'clean' and then not have to check it on
> reboot is a bonus on top of this.
>
> David Lang
Nikita.