From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Theodore Tso <tytso@mit.edu>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH 3/3] e2fsprogs: Support for large inode migration.
Date: Thu, 26 Jul 2007 17:15:30 +0530
Message-ID: <46A8895A.5000308@linux.vnet.ibm.com>
In-Reply-To: <20070725143209.GA23613@thunk.org>



Theodore Tso wrote:
> On Wed, Jul 25, 2007 at 11:06:28AM +0530, Aneesh Kumar K.V wrote:
>> From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>>
>> Add new option -I <inode_size> to tune2fs.
>> This is used to change the inode size. The size
>> needs to be a multiple of 2, and we don't allow
>> decreasing the inode size.
>>
>> As a part of increasing the inode size we throw
>> away the free inodes in the last block group. If
>> we can't, we fail. In that case one can resize the
>> file system and then try to increase the inode size.
> 
> Let me guess, you're testing with a filesystem with two block groups,
> right?  And to date you've tested *only* by doubling the size of the
> inode.
> 


I tested this with more than one configuration (1 group and 7 groups). But yes,
all the testing was changing the inode size from 128 to 256.


> What your patch does is keep the number of inode blocks per block
> group constant, so that the total number of inodes decreases by
> whatever factor the inode size is increasing.  It's a cheap, dirty way
> of doing the resizing, since it avoids needing to either (a) update
> directory entries when inode numbers get renumbered, or (b) update
> inodes when blocks need to get relocated in order to make room
> for growing the inode table.
> 


That is correct. What I was looking at was getting dynamic inode location
first. That should help us place large inodes anywhere, right? But I know
that is a long-term goal, since there are no patches for dynamic inode location.

I will work on increasing the inode table size as a part of increasing the inode
size.
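
To make the arithmetic behind "keep the inode blocks per group constant"
concrete, here is a minimal sketch. The helper name and its parameters are
made up for illustration; nothing here is taken from the patch or from
libext2fs.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of e2fsprogs): number of inodes that fit
 * in one block group's inode table when the table keeps the same number
 * of blocks but the on-disk inode size changes.
 * Returns 0 if the inode size is zero or larger than a block.
 */
static uint32_t inodes_per_group_after(uint32_t inode_blocks_per_group,
                                       uint32_t block_size,
                                       uint32_t new_inode_size)
{
    if (new_inode_size == 0 || new_inode_size > block_size)
        return 0;

    /* Inodes that fit in one block, times the unchanged table length. */
    return inode_blocks_per_group * (block_size / new_inode_size);
}
```

With 4096-byte blocks and a 512-block inode table, going from 128-byte to
256-byte inodes takes the per-group count from 16384 to 8192, so the total
number of inodes in the filesystem halves as well.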



> The problem with your patch is:
> 
> 	* By shrinking the number of inodes, it can constrain the
>           ability of the filesystem to create new files in the future.
> 

I explained this in the commit log.


> 	* It ruins the inode and block placement algorithms where we
>           try to keep inodes in the same block group as their parent
>           directory, and we try to allocate blocks in the same block
>           group as their containing inode.


I missed this in my analysis. So this means we may end up with bad performance
after resizing the inode. I will look at increasing the inode table size as a
part of increasing the inode size.


> 
> 	* Because the current patch makes no attempt to relocate
>           inodes, and doubling the inode size chops the
>           number of inodes in half, there must be no inodes in the
>           last half of the inode table.  That is, if there are N block
>           groups, the inode tables in block groups N/2 to N-1 must be
>           empty.  But because of the block group spreading algorithm,
>           where new directories get pushed out to new block groups, in
>           any real-life filesystem, the use of block groups is
>           evenly spread out, which means in practice you won't see
>           a case where the last half of the inodes is not in use.
>           Hence, your patch won't actually work in practice.
> 
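
The precondition described in the last point can be sketched as a simple
scan. This is only an illustration under stated assumptions: the in-use map
is a plain byte array indexed by (inode number - 1), not the real libext2fs
bitmap type, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check (not from the patch): shrinking the inode count from
 * old_count to new_count without relocating anything is only possible if
 * every inode numbered above new_count is free.
 * in_use[i] is nonzero when inode number i + 1 is allocated.
 */
static bool upper_inodes_unused(const unsigned char *in_use,
                                uint32_t old_count,
                                uint32_t new_count)
{
    for (uint32_t i = new_count; i < old_count; i++) {
        if (in_use[i])
            return false;   /* a live inode would be thrown away */
    }
    return true;
}
```

On a filesystem where directories have been spread across the block groups,
a scan like this would be expected to fail almost immediately, which is the
point being made above.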
> So unfortunately, the right answer *will* require expanding the inode
> tables, and potentially moving blocks out of the way in order to make
> room for it.  A lot of that machinery is in resize2fs, actually, and
> I'm wondering if the right answer is to move resize2fs's functionality
> into tune2fs.  We will also need this to be able to add the resize
> inode after the fact.
> 
> That's not going to be a trivial set of changes; if you're looking for
> something to test the undo manager, my suggestion would be to wire it
> up into mke2fs and/or e2fsck first.  Mke2fs might be nice since it
> will give us a recovery path in case someone screws up the arguments
> to mkfs.  
> 

I guess the undo I/O manager can go in, because I have been using it for
the ext3 -> ext4 inode migration testing and for testing the above patch.


Why would one need to recover after mkfs? One can simply run mkfs again, right?



>> tune2fs uses the undo I/O manager when migrating to large
>> inodes. This helps in reverting the changes if the end results
>> are not correct. The environment variable TUNE2FS_SCRATCH_DIR
>> is used to indicate the directory within which the tdb
>> file needs to be created. The file will be named tune2fs-XXXXXX
> 
> My suggestion would be to use something like /var/lib/e2fsprogs as the
> default directory.  And we should also do some tests to make sure
> something sane happens if we run out of room for the undo file.
> Presumably the only thing we can do is to abort the run and then back
> out the changes using what was written out to the undo file.
> 

I had a FIXME!! in the code noting that it would be nice to use the conf file,
but right now the conf file is e2fsck-specific:

+	char *tdb_dir, tdb_file[PATH_MAX];
+#if 0 /* FIXME!! */
+	/*
+	 * Configuration via a conf file would be
+	 * nice
+	 */
+	profile_get_string(profile, "scratch_files",
+					"directory", 0, 0,
+					&tdb_dir);
+#endif
+	tdb_dir = getenv("TUNE2FS_SCRATCH_DIR");
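
Combining the environment variable with the suggested default could look
roughly like the sketch below. The function name, the macro, and the choice
to pass the environment value in as a parameter are assumptions made for
illustration; only TUNE2FS_SCRATCH_DIR, /var/lib/e2fsprogs and the
tune2fs-XXXXXX name come from this thread.

```c
#include <stdio.h>
#include <stdlib.h>

/* Default suggested in the review; used when the environment is unset. */
#define DEFAULT_SCRATCH_DIR "/var/lib/e2fsprogs"

/*
 * Hypothetical helper: build the template for the undo tdb file name.
 * dir_from_env is whatever getenv("TUNE2FS_SCRATCH_DIR") returned and may
 * be NULL or empty. Returns the length of the resulting path, or -1 if it
 * does not fit in buf.
 */
static int build_tdb_template(const char *dir_from_env,
                              char *buf, size_t len)
{
    const char *dir = dir_from_env;
    int n;

    if (dir == NULL || *dir == '\0')
        dir = DEFAULT_SCRATCH_DIR;

    n = snprintf(buf, len, "%s/tune2fs-XXXXXX", dir);
    if (n < 0 || (size_t)n >= len)
        return -1;      /* encoding error or truncated path */
    return n;
}
```

A caller would pass getenv("TUNE2FS_SCRATCH_DIR") as the first argument and
hand the resulting template to something like mkstemp(3), aborting the run
if either step fails.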


-aneesh

