From: Amit Gud <gud@ksu.edu>
To: Nikita Danilov <nikita@clusterfs.com>
Cc: David Lang <david.lang@digitalinsight.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	val_henson@linux.intel.com, riel@surriel.com, zab@zabbo.net,
	arjan@infradead.org, suparna@in.ibm.com, brandon@ifup.org,
	karunasagark@gmail.com
Subject: Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck
Date: Tue, 24 Apr 2007 16:53:11 -0500
Message-ID: <462E7C47.8080604@ksu.edu>
In-Reply-To: <17966.23512.363955.141489@gargle.gargle.HOWL>

Nikita Danilov wrote:
> Maybe I failed to describe the problem precisely.
> 
> Suppose that all chunks have been checked. After that, for every inode
> I0 having continuations I1, I2, ... In, one has to check that every
> logical block is present in at most one of these inodes. For this one
> has to read I0, with all its indirect (double-indirect, triple-indirect)
> blocks, then read I1 with all its indirect blocks, etc. And to repeat
> this for every inode with continuations.
> 
> In the worst case (every inode has a continuation in every chunk) this
> obviously is as bad as un-chunked fsck. But even in the average case,
> total amount of io necessary for this operation is proportional to the
> _total_ file system size, rather than to the chunk size.
> 

Perhaps I should explain how continuation inodes are managed and 
located on disk. (This describes my current implementation.)

Right now, there is no distinction between an inode and a continuation 
inode (also referred to as 'cnode' below), except for the 
EXT2_IS_CONT_FL flag. Every inode holds a fixed-size list of 
continuation inodes, currently limited to 4 entries.

The structure looks like this:

 +---------+           +---------+
 | cnode 0 |---------->| cnode 0 |----------> to another cnode or NULL
 +---------+           +---------+
 | cnode 1 |-----+     | cnode 1 |-----+
 +---------+     |     +---------+     |
 | cnode 2 |---+ |     | cnode 2 |---+ |
 +---------+   | |     +---------+   | |
 | cnode 3 |-+ | |     | cnode 3 |-+ | |
 +---------+ | | |     +---------+ | | |
             v v v                 v v v

             inodes             inodes or NULL

I.e., only the first cnode in the list carries the chain forward when 
all the slots are occupied.

Every cnode# field contains
{
	ino_t cnode;	/* inode number of the continuation inode */
	__u32 start;	/* starting logical block number */
	__u32 end;	/* ending logical block number */
}

(The current implementation has just one field: cnode.)

I chose this structure to avoid recursion and/or the use of any 
auxiliary data structure while traversing the continuation inodes.
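
A minimal sketch of how such a chain could be walked with a plain loop, 
assuming a toy in-memory inode table. The names cont_slot, toy_inode and 
count_cnodes are hypothetical, as is the convention that inode number 0 
marks an empty slot; none of this is taken from the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONT_SLOTS 4	/* slots per inode, as described in the text */

/* One slot of the per-inode list; cnode == 0 means "empty" (assumption). */
struct cont_slot {
	uint32_t cnode;	/* inode number of the continuation inode */
	uint32_t start;	/* starting logical block number */
	uint32_t end;	/* ending logical block number */
};

/* Hypothetical in-memory inode: only the continuation list is modelled. */
struct toy_inode {
	struct cont_slot slot[CONT_SLOTS];
};

/*
 * Count the continuation inodes reachable from 'ino'.
 * 'table' is a toy inode table indexed by inode number, with 'n' entries.
 * Slots 1..3 are leaves and only slot 0 links to the next list, so a
 * single loop suffices: no recursion, no stack, no queue.
 * There is no cycle detection; a corrupted chain would loop forever.
 */
static size_t count_cnodes(const struct toy_inode *table, size_t n,
			   uint32_t ino)
{
	size_t count = 0;

	while (ino != 0) {
		assert(ino < n);
		const struct toy_inode *cur = &table[ino];

		for (int i = 0; i < CONT_SLOTS; i++)
			if (cur->slot[i].cnode != 0)
				count++;

		ino = cur->slot[0].cnode;	/* follow the chain via slot 0 */
	}
	return count;
}
```

Because only slot 0 can extend the chain, the amount of state needed 
during traversal is a single inode number.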

An additional flag, EXT2_SPARSE_CONT_FL, would indicate whether the 
inode has any sparse portions. The 'start' and 'end' fields speed up 
finding a cnode for a given logical block number without actually 
reading the inode. They could be dropped, perhaps more conveniently, 
by pinning the cnodes in memory as they are read.
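
A minimal sketch of the lookup that 'start' and 'end' are meant to speed 
up, under the same toy in-memory model. The names cont_slot, toy_inode 
and find_cnode are hypothetical. It also assumes that 'end' is inclusive 
and that inode number 0 marks an empty slot; the message does not state 
either.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CONT_SLOTS 4	/* slots per inode, as described in the text */

struct cont_slot {
	uint32_t cnode;	/* inode number of the continuation inode, 0 = empty */
	uint32_t start;	/* starting logical block number */
	uint32_t end;	/* ending logical block number (assumed inclusive) */
};

/* Hypothetical in-memory inode: only the continuation list is modelled. */
struct toy_inode {
	struct cont_slot slot[CONT_SLOTS];
};

/*
 * Return the inode number of the cnode whose [start, end] range covers
 * logical block 'lblk', or 0 if no continuation covers it.
 * Only the slot lists are consulted; the cnodes that do not match are
 * never read, which is the point of keeping the range in the slot.
 */
static uint32_t find_cnode(const struct toy_inode *table, size_t n,
			   uint32_t ino, uint32_t lblk)
{
	while (ino != 0) {
		assert(ino < n);
		const struct toy_inode *cur = &table[ino];

		for (int i = 0; i < CONT_SLOTS; i++) {
			const struct cont_slot *s = &cur->slot[i];

			if (s->cnode != 0 && s->start <= lblk && lblk <= s->end)
				return s->cnode;
		}
		ino = cur->slot[0].cnode;	/* next list hangs off slot 0 */
	}
	return 0;
}
```

A return value of 0 here would mean the block is not held by any 
continuation, so the caller would fall back to the head inode itself.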

Now, going back to Nikita's question: all the cnodes of an inode need 
to be scanned iff the 'start' field, the number of blocks, or the 
EXT2_SPARSE_CONT_FL flag of any of its cnodes is altered.
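
The condition above could be expressed as a predicate over a "before" 
and "after" snapshot of each cnode. This is only an illustration: the 
cnode_state structure, the needs_rescan function and the snapshot 
approach are hypothetical, and TOY_SPARSE_CONT_FL is a placeholder bit 
standing in for EXT2_SPARSE_CONT_FL, whose real value is not given in 
the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit standing in for EXT2_SPARSE_CONT_FL (value is made up). */
#define TOY_SPARSE_CONT_FL 0x1u

/* Hypothetical per-cnode snapshot of the three fields that matter. */
struct cnode_state {
	uint32_t start;		/* starting logical block number */
	uint32_t blocks;	/* number of blocks held by the cnode */
	uint32_t flags;		/* inode flags */
};

/*
 * Return true iff any cnode changed its 'start', its block count, or
 * its sparse flag between the two snapshots. Changes to other flag
 * bits are ignored.
 */
static bool needs_rescan(const struct cnode_state *before,
			 const struct cnode_state *after, size_t count)
{
	assert(before != NULL && after != NULL);

	for (size_t i = 0; i < count; i++) {
		if (before[i].start != after[i].start)
			return true;
		if (before[i].blocks != after[i].blocks)
			return true;
		if ((before[i].flags ^ after[i].flags) & TOY_SPARSE_CONT_FL)
			return true;
	}
	return false;
}
```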

And yes, the whole attempt is to reduce the number of continuation inodes.

Comments, suggestions welcome.


AG
-- 
May the source be with you.
http://www.cis.ksu.edu/~gud

