linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Karuna sagar k" <karunasagark@gmail.com>
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Testing framework
Date: Thu, 28 Dec 2006 18:48:39 +0530	[thread overview]
Message-ID: <2e4afe1e0612280518s6bccefc5k29211c20a2a38062@mail.gmail.com> (raw)

Hi,

I am working on a testing framework for file systems focusing on
repair and recovery areas. Right now, I have been timing fsck and
trying to determine the effectiveness of fsck. The idea that I have is
below.

In abstract terms, I create a file system (ideal state), corrupt it,
run fsck on it and compare the recovered state with the ideal one. So
the whole process is divided into phases:

1. Prepare Phase - a new file system is created, populated and aged.
This state of the file system is consistent and is considered as ideal
state. This state is to be recorded for later comparisons (during the
comparison phase).

2. Corruption Phase - the file system on the disk is corrupted. We
should be able to reproduce such corruption on different test runs
(probably not very accurately). This way we would be able to compare
the file systems in a better way.

3. Repair Phase - fsck is run to repair and recover the disk to a
consistent state. The time taken by fsck is determined here.

4. Comparison Phase - the current state (recovered state) is compared
(a logical comparison) with the ideal state. The comparison tells what
fsck recovered, and how close are the two states.

Apart from this we would also require a component to record the state
of the file system. For this we construct a tree (which represents the
state of the file system) where each node stores some info on the
files and directories in the file system and the tree structure
records the parent children relationship among the files and
directories.

Currently I am focusing on the corruption and comparison phases:

Corruption Phase:
Focus - the corruption should be reproducible. This gives the control
over the comparison between test runs and file systems. I am assuming
here that different test runs would deal with same files.

I have come across two approaches here:

Approach 1 -
1. Among the files present on the disk, we randomly choose few files.
2. For each such file, we will then find the meta data block info
(inode block).
3. We seek to such blocks and corrupt them (the corruption is done by
randomly writing data to the block or some predetermined info)
4. Such files are noted to reproduce the similar corruption.

The above approach looks at the file system from a very abstract view.
But there may be file system specific data structures on the disk (eg.
group descriptors in case of ext2). These are not handled by this
approach directly.

Approach 2 - We pick up meta data blocks from the disk, and form a
logical disk structure containing just these meta data blocks. The
blocks on the logical disk map onto the physical disk. We pick up
randomly blocks from this and corrupt them.

Comparison Phase:
We determine the logical equivalence between the recovered and ideal
states of the disk i.e. say if a file was lost during corruption and
fsck recovered it and put it in the lost+found directory. We have to
recognize this as logical equivalent and not report that the file is
lost.

Right now I have a basic implementation of the above with random
corruption (using fsfuzzer logic).

Any suggestions?

Thanks,
Karuna

             reply	other threads:[~2006-12-28 13:18 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-28 13:18 Karuna sagar k [this message]
  -- strict thread matches above, loose matches on Subject: below --
2007-04-22 20:46 Testing framework Karuna sagar K
2007-04-23  9:09 ` Kalpak Shah
2007-04-23 11:25   ` Karuna sagar K
2007-04-23 14:04 ` Avishay Traeger
2007-04-23 22:11   ` Ric Wheeler
2007-04-25 11:28   ` Karuna sagar K
2007-04-28  9:35 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2e4afe1e0612280518s6bccefc5k29211c20a2a38062@mail.gmail.com \
    --to=karunasagark@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).