linux-fsdevel.vger.kernel.org archive mirror
From: Kalpak Shah <kalpak@linsyssoft.com>
To: Karuna sagar K <karunasagark@gmail.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Testing framework
Date: Mon, 23 Apr 2007 14:39:39 +0530
Message-ID: <1177319379.4579.13.camel@garfield>
In-Reply-To: <2e4afe1e0704221346u6d6baec1uab88dc273ff08de9@mail.gmail.com>

On Mon, 2007-04-23 at 02:16 +0530, Karuna sagar K wrote:
> Hi,
> 
> For some time I have been working on this file system test framework.
> Now I have an implementation of it, and below is the explanation.
> Any comments are welcome.
> 
> Introduction:
> The testing tools and benchmarks available around do not take into
> account the repair and recovery aspects of file systems. The test
> framework described here focuses on repair and recovery capabilities
> of file systems. Since most file systems use 'fsck' to recover from
> file system inconsistencies, the test framework characterizes file
> systems based on outcomes of running 'fsck'.

<snip>

> Higher level perspective/approach:
> In this approach the file system is viewed as a tree of nodes, where
> nodes are either files or directories. The metadata information
> corresponding to some randomly chosen nodes of the tree are corrupted.
> Nodes which are corrupted are marked or recorded to be able to replay
> later. This file system is called source file system while the file
> system on which we need to replay the corruption is called target file
> system. The assumption is that the target file system contains a set
> of files and directories which is a superset of that in the source
> file system. Hence to replay the corruption we need to identify which
> nodes were corrupted in the source file system and corrupt the
> corresponding nodes in the target file system.
> 
> A major disadvantage with this approach is that on-disk structures
> (like superblocks, block group descriptors, etc.) are not considered
> for corruption.
> 
> Lower level perspective/approach:
> The file system is looked upon as a set of blocks (more precisely
> metadata blocks). We randomly choose from this set of blocks to
> corrupt. Hence we would be able to overcome the deficiency of the
> previous approach. However this approach makes it difficult to have a
> replayable corruption. Further thought about this approach has to be
> given.
> 

Fill a test filesystem with data and save it. Corrupt it by copying a
chunk of data from a random location A to another random location B.
Save positions A and B so that you can reproduce the corruption.
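
A minimal sketch of the copy-a-chunk idea, assuming the saved filesystem
is an ordinary image file. The function names, the record layout and
the fixed chunk size are my own illustration, not something taken from
the framework:

```python
import os


def copy_chunk(image_path, src, dst, size):
    """Overwrite `size` bytes at offset `dst` with the bytes found at `src`."""
    with open(image_path, "r+b") as img:
        img.seek(src)
        chunk = img.read(size)   # read the whole chunk before writing
        img.seek(dst)
        img.write(chunk)


def corrupt_by_copy(image_path, size, rng):
    """Pick random offsets A (src) and B (dst), copy A over B, return the record.

    `rng` is a random.Random instance; seeding it is optional because the
    returned record alone is enough to replay the corruption.
    """
    limit = os.path.getsize(image_path) - size
    src = rng.randrange(limit + 1)
    dst = rng.randrange(limit + 1)
    copy_chunk(image_path, src, dst, size)
    return {"src": src, "dst": dst, "size": size}
```

To reproduce the corruption, take a fresh copy of the saved image and
call copy_chunk() with the recorded src, dst and size.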

Or corrupt random bits (ideally in metadata blocks) and maintain the
list of the bit numbers for reproducing the corruption.
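
The bit-flipping variant could look roughly like this. It is only a
sketch: the names are hypothetical, and finding where the metadata
blocks live is filesystem specific, so the caller is assumed to pass
the byte range to corrupt via first_byte and last_byte:

```python
import os


def flip_bits(image_path, bit_numbers):
    """Invert each listed bit; bit n is bit (n % 8) of byte (n // 8)."""
    with open(image_path, "r+b") as img:
        for n in bit_numbers:
            img.seek(n // 8)
            byte = img.read(1)[0]
            img.seek(n // 8)
            img.write(bytes([byte ^ (1 << (n % 8))]))


def corrupt_random_bits(image_path, count, rng, first_byte=0, last_byte=None):
    """Flip `count` distinct random bits in [first_byte, last_byte).

    `rng` is a random.Random instance. Returns the list of bit numbers so
    the same corruption can be replayed on a fresh copy of the image.
    """
    if last_byte is None:
        last_byte = os.path.getsize(image_path)
    bits = rng.sample(range(first_byte * 8, last_byte * 8), count)
    flip_bits(image_path, bits)
    return bits
```

Replaying is a call to flip_bits() with the saved list on a fresh copy.
Because the bit numbers are distinct, applying the same list a second
time restores the original image, which is handy for sanity checks.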

> We could have a blend of both the approaches in the program to
> compromise between corruption and replayability.
> 
> Repair Phase:
> The corrupted file system is repaired and recovered with 'fsck' or any
> other tools; this phase considers the repair and recovery action on
> the file system as a black box. The time taken to repair by the tool
> is measured.

I see that you are running fsck just once on the test filesystem. It
might be a good idea to run it twice: if the second fsck does not find
the filesystem completely clean, that indicates a bug in fsck.
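
Something along these lines, as a rough sketch. I am assuming e2fsck
with -f (force a check) and -y (answer yes to all questions) since only
ext2 is supported right now, and relying on the documented convention
that exit status 0 means no errors were found. The helper names are
made up for illustration:

```python
import subprocess


def run_fsck(image_path, fsck_cmd=("e2fsck", "-f", "-y")):
    """Run fsck on the image and return its exit status."""
    return subprocess.run(list(fsck_cmd) + [image_path]).returncode


def check_twice(image_path, run=run_fsck):
    """Repair once, then check that a second pass reports a clean filesystem.

    `run` is any callable that takes the image path and returns an fsck
    exit status; it is a parameter so another tool can be substituted.
    """
    first = run(image_path)
    second = run(image_path)
    return {"first": first, "second": second, "second_pass_clean": second == 0}
```

If second_pass_clean is false, the first run left something unrepaired
(or introduced new damage), and the image is worth keeping for a bug
report.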

<snip>

> Summary Phase:
> This is the final phase in the model. A report file is prepared which
> summarizes the result of this test run. The summary contains:
> 
> Average time taken for recovery
> Number of files lost at the end of each iteration
> Number of files with metadata corruption at the end of each iteration
> Number of files with data corruption at the end of each iteration
> Number of files lost and found at the end of each iteration
> 
> Putting it all together:
> The Corruption, Repair and Comparison phases could be repeated a
> number of times (each repetition is called an iteration) before the
> summary of that test run is prepared.
> 
> TODO:
> Account for files in the lost+found directory during the comparison phase.
> Support for other file systems (only ext2 is supported currently)
> The state of either file system is stored, which may be huge, time
> consuming and unnecessary. So, we could have better ways of storing
> the state.

Also, people may want to test with different mount options, so something
like "mount -t $fstype -o loop,$MOUNT_OPTIONS $imgname $mountpt" may be
useful. Similarly it may also be useful to have MKFS_OPTIONS while
formatting the filesystem.
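
As a sketch of how the two option strings might be threaded through,
assuming the framework builds its command lines in one place. The
function names are hypothetical; MOUNT_OPTIONS and MKFS_OPTIONS are
plain strings, e.g. taken from the environment:

```python
import shlex


def mount_command(fstype, imgname, mountpt, mount_options=""):
    """Build the loop-mount command, appending extra options if given."""
    # Avoid a trailing comma ("loop,") when no extra options are set.
    opts = "loop" + ("," + mount_options if mount_options else "")
    return ["mount", "-t", fstype, "-o", opts, imgname, mountpt]


def mkfs_command(fstype, imgname, mkfs_options=""):
    """Build the mkfs command; options are split like a shell would."""
    return ["mkfs", "-t", fstype] + shlex.split(mkfs_options) + [imgname]
```

The lists can be handed to subprocess.run() as they are, with the
option strings read via os.environ.get("MOUNT_OPTIONS", "") and
os.environ.get("MKFS_OPTIONS", "").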

Thanks,
Kalpak.

> 
> Comments are welcome!!
> 
> Thanks,
> Karuna

