linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Karuna sagar K" <karunasagark@gmail.com>
To: "Amit Gud" <gud@ksu.edu>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: ChunkFS - measuring cross-chunk references
Date: Mon, 23 Apr 2007 12:49:58 +0530	[thread overview]
Message-ID: <2e4afe1e0704230019h3853bc82uae6c9b5cdc325ab9@mail.gmail.com> (raw)
In-Reply-To: <462B8D0C.7020509@ksu.edu>

Hi,

The tool estimates the cross-chunk references from an extt2/3 file
system. It considers a block group as one chunk and calcuates how many
block groups does a file span across. So, the block group size gives
the estimate of chunk size.

The file systems were aged for about 3-4 months on a developers laptop.

Should have given the background before. Below is the explanations for
the tool. Valh and others came up with this idea.

-------------------------
Chunkfs will only work if we have "few" cross-chunk references.  We
can estimate the effect of chunk size on the number of these
references using an existing ext2/3 file system and treating the block
groups as though they are chunks.  The basic idea is that we figure
out what the block group boundaries are and then find out which files
and directories span two or more block groups.

Step 1:
-------

Get a real-world ext2/3 file system. A file system which has been in
use is required. One from a laptop or a server of any sort will do
fine.

Step 2:
-------

Figure out where the block group boundaries are on disk. Two things
are to be known:

1. Which inode numbers are in which block group?
2. Which blocks are in which block group?

At the end of this step we should have a list that looks something like:

Block group 1: Inodes 11-343, blocks 1000-20000
Block group 2: Inodes 344-576, blocks 20000-40000
[...]

Step 3:
-------

For each file, get the inode number and use mapping from step 2 to
figure out which block group it is in.  Now use bmap() on each block
in the file, and find out the block number.  Use mapping from step 2
to figure out which block groups it has data in. For each file, record
the list of all block groups.

For each directory, get the inode number and map that to a block
group. Then get the inode numbers of all entries in the directory
(ignore symlinks) and map them to a block group.  For each directory,
record the list of all block groups.

Step 4:
-------

Count the number of cross-chunk references this file system would
need.  This is done by going through each directory and file, and
adding up the number of block groups it uses MINUS one.  So if a file
was in block groups 3, 7, and 24, then you would add 2 to the total
number of cross-chunk references.  If a file was only in block group
2, then you would add 0 to the total.


On 4/22/07, Amit Gud <gud@ksu.edu> wrote:
> Karuna sagar K wrote:
> > Hi,
> >
> > The attached code contains program to estimate the cross-chunk
> > references for ChunkFS file system (idea from Valh). Below are the
> > results:
> >
>
> Nice to see some numbers! But would be really nice to know:
>
> - what the chunk size is
> - how the files were created or, more vaguely, how 'aged' the fs is
> - what is the chunk allocation algorithm
>
>
> Best,
> AG
> --
> May the source be with you.
> http://www.cis.ksu.edu/~gud
>
>


Thanks,
Karuna

  reply	other threads:[~2007-04-23  7:19 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-04-22 20:35 ChunkFS - measuring cross-chunk references Karuna sagar K
2007-04-22 16:27 ` Amit Gud
2007-04-23  7:19   ` Karuna sagar K [this message]
2007-04-23  9:34     ` Kalpak Shah
2007-04-23 20:53       ` Andreas Dilger
2007-04-24  0:13         ` Theodore Tso
2007-04-24  1:02           ` Arjan van de Ven
2007-04-24  1:24             ` Amit Gud
2007-04-24  1:38               ` Amit Gud
2007-04-24 15:07             ` Theodore Tso
2007-04-25  0:17           ` Karuna sagar K
2007-04-25  0:20             ` Karuna sagar K
2007-04-25 10:23               ` Suparna Bhattacharya
2007-05-19 22:49                 ` Karuna sagar K
2007-05-07  5:18           ` Valerie Henson
2007-05-07  1:09         ` Valerie Henson
2007-05-07  1:09 ` Valerie Henson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2e4afe1e0704230019h3853bc82uae6c9b5cdc325ab9@mail.gmail.com \
    --to=karunasagark@gmail.com \
    --cc=gud@ksu.edu \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).