From: Valerie Henson <val_henson@linux.intel.com>
To: David Chinner <dgc@sgi.com>
Cc: Amit Gud <gud@ksu.edu>, Nikita Danilov <nikita@clusterfs.com>,
	David Lang <david.lang@digitalinsight.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	riel@surriel.com, zab@zabbo.net, arjan@infradead.org,
	suparna@in.ibm.com, brandon@ifup.org, karunasagark@gmail.com
Subject: Re: [RFC][PATCH] ChunkFS: fs fission for faster fsck
Date: Wed, 25 Apr 2007 16:03:44 -0700
Message-ID: <20070425230344.GC16129@nifty>
In-Reply-To: <20070425105434.GX32602149@melbourne.sgi.com>

On Wed, Apr 25, 2007 at 08:54:34PM +1000, David Chinner wrote:
> On Tue, Apr 24, 2007 at 04:53:11PM -0500, Amit Gud wrote:
> > 
> > The structure looks like this:
> > 
> >  ----------		----------
> > | cnode 0  |---------->| cnode 0  |----------> to another cnode or NULL
> >  ----------		----------
> > | cnode 1  |-----      | cnode 1  |-----
> >  ----------	|	----------	|
> > | cnode 2  |-- |      | cnode 2  |--   |
> >  ----------  | |	----------  |   |
> > | cnode 3  | | |      | cnode 3  | |   |
> >  ----------  | |	----------  |   |
> > 	  |  |  |		 |  |   |
> > 
> > 	   inodes		inodes or NULL
> 
> How do you recover if fsfuzzer takes out a cnode in the chain? The
> chunk is marked clean, but clearly corrupted and needs fixing and
> you don't know what it was pointing at.  Hence you have a pointer to
> a trashed cnode *somewhere* that you need to find and fix, and a
> bunch of orphaned cnodes that nobody points to *somewhere else* in
> the filesystem that you have to find. That's a full scan fsck case,
> isn't it?

Excellent question.  This is one of the trickier aspects of chunkfs -
the orphan inode problem (tricky, but solvable).  The problem: what
happens if you smash/lose/corrupt an inode in one chunk that has a
continuation inode in another chunk?  A back pointer does you no good
if the back pointer itself is corrupted.

What you do is keep tabs on whether you see damage that looks like
this has occurred - e.g., inode use/free counts wrong, you had to zero
a corrupted inode - and when this happens, you do a scan of all
continuation inodes in chunks that have links to the corrupted chunk.
What you need to make this go fast is (1) a pre-made list of which
chunks have links with which other chunks, and (2) a fast way to read
all of the continuation inodes in a chunk (ignoring chunk-local inodes).
This stage is O(fs size) approximately, but it should be quite swift.
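
To make the scan above concrete, here is a rough sketch in Python
(purely illustrative, not chunkfs code).  The record layout and the
names ContinuationInode, find_orphans, chunk_links and continuations
are all assumptions made up for this example; the only ideas taken
from the description are the pre-made chunk link list and the
per-chunk list of continuation inodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContinuationInode:
    # Hypothetical record layout, invented for this sketch.
    chunk: int         # chunk holding this continuation inode
    ino: int           # inode number within that chunk
    parent_chunk: int  # back pointer: chunk of the inode it continues
    parent_ino: int    # back pointer: inode number in parent_chunk


def find_orphans(damaged_chunk, surviving_inos, chunk_links, continuations):
    """Return continuation inodes whose parent in damaged_chunk is gone.

    chunk_links maps a chunk to the set of chunks that have links with it
    (the pre-made list); continuations maps a chunk to its continuation
    inodes only, so chunk-local inodes are never read.
    """
    orphans = []
    # Only chunks known to be linked to the damaged chunk are scanned.
    for chunk in sorted(chunk_links.get(damaged_chunk, ())):
        for cont in continuations.get(chunk, ()):
            if cont.parent_chunk != damaged_chunk:
                continue  # continues an inode in some other chunk
            if cont.parent_ino not in surviving_inos:
                orphans.append(cont)
    return orphans
```

With few cross-chunk links, the loop touches only a small part of the
file system; what to do with the orphans it finds is a separate
question this sketch does not address.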

> It seems that any sort of damage to the underlying storage (e.g.
> media error, I/O error or user brain explosion) results in the need
> to do a full fsck and hence chunkfs gives you no benefit in this
> case.

I worry about this, but so far I haven't found anything that couldn't
be cut down significantly with just a little extra work.  It might be
helpful to look at an extreme case.

Let's say we're incredibly paranoid.  We could be justified in running
a full fsck on the entire file system in between every single I/O.
After all, something *might* have been silently corrupted.  But this
would be ridiculously slow.  We could instead never check the file
system.  But then we would end up panicking and corrupting the file
system a lot.  So what's a good compromise?

In the chunkfs case, here are my rules of thumb so far:

1. Detection: All metadata has magic numbers and checksums.
2. Scrubbing: Random check of chunks when possible.
3. Repair: When we detect corruption, whether by checksum error, file
   system code assertion failure, or the hardware telling us we have a
   bug, check the chunk containing the error and any outside-chunk
   information that could be affected by it.
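
As an illustration of rule 1, a metadata block could carry a magic
number and a checksum in a small header.  The sketch below is in
Python and is only an assumption about how such a check might look:
the MAGIC value, the header layout, and the seal/check names are
invented for this example and are not the chunkfs on-disk format.

```python
import struct
import zlib

MAGIC = 0x43484E4B  # hypothetical magic value ("CHNK"), not from chunkfs
HEADER = struct.Struct("<III")  # magic, payload length, CRC32 of payload


def seal(payload: bytes) -> bytes:
    """Prefix a metadata payload with magic, length and CRC32."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def check(block: bytes) -> bool:
    """Return True only if the magic, length and checksum all match."""
    if len(block) < HEADER.size:
        return False
    magic, length, crc = HEADER.unpack_from(block)
    payload = block[HEADER.size:HEADER.size + length]
    return (
        magic == MAGIC
        and len(payload) == length
        and zlib.crc32(payload) == crc
    )
```

A failed check() is the kind of event that would trigger rule 3:
check the containing chunk and whatever outside-chunk information
could be affected.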

-VAL
