linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Theodore Tso <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH 3/3] e2fsprogs: Support for large inode migration.
Date: Thu, 26 Jul 2007 17:15:30 +0530	[thread overview]
Message-ID: <46A8895A.5000308@linux.vnet.ibm.com> (raw)
In-Reply-To: <20070725143209.GA23613@thunk.org>



Theodore Tso wrote:
> On Wed, Jul 25, 2007 at 11:06:28AM +0530, Aneesh Kumar K.V wrote:
>> From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>>
>> Add new option -I <inode_size> to tune2fs.
>> This is used to change the inode size. The size
>> need to be multiple of 2 and we don't allow to
>> decrease the inode size.
>>
>> As a part of increasing the inode size we throw
>> away the free inodes in the last block group. If
>> we can't we fail. In such case one can resize the
>> file system and then try to increase the inode size.
> 
> Let me guess, you're testing with a filesystem with two block groups,
> right?  And to date you've tested *only* by doubling the size of the
> inode.
> 


I tested this with multiple ( 1 and 7 ) groups. But yes all the testing was to change
inode size from 128 to 256. 


> What your patch does is is keep the number of inode blocks per block
> group constant, so that the total number of inodes decreases by
> whatever factor the inode size is increasing.  It's a cheap, dirty way
> of doing the resizing, since it avoids needing to either (a) update
> directory entries when inode numbers get renumbered, and (b) need to
> update inodes when blocks need to get relocated in order to make room
> for growing the inode table.
> 


That is correct. What i was looking at was to get the dynamic inode location
first. That should help us to place large inode any where right ?. But i know
that is a long term goal since there is no patches for dynamic inode location.

I will work at increasing the inode table size as a part of increasing the inode
size. 



> The problem with your patch is:
> 
> 	* By shrinking the number of inodes, it can constrain the
>           ability of the filesystem to create new files in the future.
> 

I explained this in the commit log.


> 	* It ruins the inode and block placement algorithms where we
>           try to keep inodes in the same block group as their parent
>           directory, and we try to allocate blocks in the same block
>           group as their containing inode.


I missed this in my analysis. So this means we may end up with bad performance
after resizing the inode. I will look at increasing the inode table size as a
part of increasing the inode size.


> 
> 	* Because when the current patch makes no attempt to relocate
>           inodes, and when it doubles the inode size, it chops the
>           number of inodes in half, there must be no inodes in the
>           last half of the inode table.  That is if there are N block
>           groups, the inode tables in blockgroups N/2 to N-1 must be
>           empty.  But because of the block group spreading algorithm,
>           where new directories get pushed out to new block groups, in
>           any real real-life filesystem, the use of block groups is
>           evenly spread out, which means in practice you won't see
>           case where the last half of the inodes will not be in use.
>           Hence, your patch won't actually work in practice.
> 
> So unfortunately, the right answer *will* require expanding the inode
> tables, and potentially moving blocks out of the way in order to make
> room for it.  A lot of that machinery is in resize2fs, actually, and
> I'm wondering if the right answer is to move resize2fs's functionality
> into tune2fs.  We will also need this to be able to add the resize
> inode after the fact.
> 
> That's not going to be a trivial set of changes; if you're looking for
> something to test the undo manager, my suggestion would be to wire it
> up into mke2fs and/or e2fsck first.  Mke2fs might be nice since it
> will give us a recovery path in case someone screws up the arguments
> to mkfs.  
> 

I guess Undo I/O manager can go in because I have been using it for
the ext3 -> ext4 inode migration testing and for testing the above patch.


Why would one need to recover on mkfs. He can very well run mkfs again right ?



>> tune2fs use undo I/O manager when migrating to large
>> inode. This helps in reverting the changes if end results
>> are not correct.The environment variable TUNE2FS_SCRATCH_DIR
>> is used to indicate the  directory within which the tdb
>> file need to be created. The file will be named tune2fs-XXXXXX
> 
> My suggestion would be to use something like /var/lib/e2fsprogs as the
> defalut directory.  And we should also do some tests to make sure
> something sane happens if we run out of room for the undo file.
> Presumably the only thing we can do is to abort the run and then back
> out the chnages using what was written out to the undo file.
> 

I had a FIXME!! in the code which stated it would be nice to use the  conf file
But right now the conffile is e2fsck specific

+	char *tdb_dir, tdb_file[PATH_MAX];
+#if 0 /* FIXME!! */
+	/*
+	 * Configuration via a conf file would be
+	 * nice
+	 */
+	profile_get_string(profile, "scratch_files",
+					"directory", 0, 0,
+					&tdb_dir);
+#endif
+	tdb_dir = getenv("TUNE2FS_SCRATCH_DIR");


-aneesh

  parent reply	other threads:[~2007-07-26 11:46 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-25  5:36 e2fsprogs: Undo I/O Manager and large inode migration support in tune2fs Aneesh Kumar K.V
     [not found] ` <bee58d48110eee4d5cd133167245b99644148d96.1185341470.git.aneesh.kumar@linux.vnet.ibm.com>
2007-07-25  5:36   ` [PATCH 1/3] e2fsprogs: Add undo I/O manager Aneesh Kumar K.V
     [not found]   ` <47f96570519d76b8d59f92b729a0a48c4a1b68d8.1185341470.git.aneesh.kumar@linux.vnet.ibm.com>
2007-07-25  5:36     ` [PATCH 2/3] e2fsprogs: Add undoe2fs Aneesh Kumar K.V
     [not found]   ` <3ae4c55b831a13f9fbb9a187efcd65d29434bf09.1185341470.git.aneesh.kumar@linux.vnet.ibm.com>
2007-07-25  5:36     ` [PATCH 3/3] e2fsprogs: Support for large inode migration Aneesh Kumar K.V
2007-07-25 14:32     ` Theodore Tso
2007-07-25 19:46       ` Andreas Dilger
2007-07-26 14:58         ` Theodore Tso
2007-07-26 11:45       ` Aneesh Kumar K.V [this message]
2007-07-26 16:13         ` Theodore Tso
2007-07-27  2:59           ` Aneesh Kumar K.V
2007-07-27 15:34             ` Theodore Tso
2007-07-27 19:03               ` Aneesh Kumar K.V
2007-07-25 19:49     ` Andreas Dilger

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46A8895A.5000308@linux.vnet.ibm.com \
    --to=aneesh.kumar@linux.vnet.ibm.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).