linux-ext4.vger.kernel.org archive mirror
From: Andreas Dilger <adilger@clusterfs.com>
To: Theodore Tso <tytso@mit.edu>
Cc: "Jose R. Santos" <jrs@us.ibm.com>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: block groups with no inode tables
Date: Tue, 10 Jul 2007 22:31:18 -0600
Message-ID: <20070711043118.GJ6417@schatzie.adilger.int>
In-Reply-To: <20070710203050.GH27033@thunk.org>

On Jul 10, 2007  16:30 -0400, Theodore Tso wrote:
> On Tue, Jul 10, 2007 at 12:12:21PM -0500, Jose R. Santos wrote:
> > As I play with the allocation of the metadata for the FLEX_BG feature,
> > it seems that we could benefit from having block groups with no inode
> > tables.  Right now we allocate one inode table per bg based on the
> > inode_blocks_per_group.  For FLEX_BG though, it would make more sense
> > to have larger inode tables that fully use the inode bitmap allocated
> > on the first few block groups.  Once we reach the number of inodes per
> > FLEX_BG, then the remaining block groups could then have no inode
> > tables defined.
> > 
> > The idea here is that we better utilize the inode bitmaps and reduce the
> > number of inode tables to improve mkfs/fsck times.  We could also
> > support growing the inode count, since some block groups would have
> > empty entries in their block group descriptors: as long as we can find
> > enough free blocks for a new inode table, expanding the number of
> > inodes should be relatively easy.
> > 
> > Don't know if ext4 currently supports this.  Any thoughts?
> 
> Plans to support this are there; Andreas sent a patch back in April to
> implement this, using bg_itable_unused, which is already reserved in
> the block group data structure.  The idea here is to speed up fsck by
> specifying how many inodes are actually in use in the block group, so
> we don't have to initialize them until they are to be used.  This is
> tied with the checksum patches, since doing this means we need to
> really worry about the accuracy of the block group descriptors or we
> could lose a lot of data if the block group descriptors are corrupted.
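
To make the bg_itable_unused idea above concrete, here is a minimal
userspace sketch of how a checker might decide how many inodes of a group
it has to scan.  The struct, its field names, and demo_inodes_to_scan()
are hypothetical and only for illustration; the sketch assumes the field
counts never-used inodes at the end of the group's inode table, and it
falls back to scanning the whole table when the count is implausible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one block group descriptor. */
struct demo_itable_desc {
    uint32_t inodes_per_group; /* total inode slots in this group */
    uint16_t itable_unused;    /* assumed: never-used inodes at the table's end */
};

/*
 * Number of inodes a checker would need to scan in this group.
 * An unused count larger than the group size cannot be right, so the
 * descriptor is not trusted in that case and the full table is scanned.
 */
static uint32_t demo_inodes_to_scan(const struct demo_itable_desc *gd)
{
    assert(gd != NULL);

    if (gd->itable_unused > gd->inodes_per_group)
        return gd->inodes_per_group;

    return gd->inodes_per_group - gd->itable_unused;
}
```

The fallback branch is one simple way to reflect the concern about
descriptor accuracy; a real implementation would presumably rely on the
descriptor checksum instead.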

I think Jose means something slightly different, but in the end the
uninit_groups feature (patches in the patch queue, but disabled for
some reason) essentially implements this.  We don't need to read inode
bitmaps from disk if the INODE_UNINIT flag is set in the group descriptor.
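
Roughly, the bitmap load path could look like the userspace sketch below.
This is not the code from the patch queue: the flag value, the descriptor
layout, and all demo_* names are assumptions made up for illustration.
The point is only that a flagged group gets an all-free bitmap synthesized
in memory, with no disk read at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag value and descriptor layout, for illustration only. */
#define DEMO_BG_INODE_UNINIT 0x0001u

struct demo_group_desc {
    uint16_t flags;        /* DEMO_BG_* flags */
    uint32_t inode_bitmap; /* block number of the on-disk inode bitmap */
};

/* Callback that reads one block from disk; returns 0 on success. */
typedef int (*demo_read_block_fn)(uint32_t block, uint8_t *buf,
                                  size_t len, void *ctx);

/*
 * Fill 'bitmap' for one group.  Returns 0 on success or the callback's
 * error code.  If the group is flagged uninitialized, no I/O is done and
 * the bitmap is synthesized as all-free (all zero bits).
 */
static int demo_load_inode_bitmap(const struct demo_group_desc *gd,
                                  uint8_t *bitmap, size_t len,
                                  demo_read_block_fn read_block, void *ctx)
{
    assert(gd != NULL && bitmap != NULL && read_block != NULL);

    if (gd->flags & DEMO_BG_INODE_UNINIT) {
        memset(bitmap, 0, len);
        return 0;
    }
    return read_block(gd->inode_bitmap, bitmap, len, ctx);
}

/* Stand-in reader for trying the sketch: counts calls, returns all ones. */
static int demo_fake_read(uint32_t block, uint8_t *buf, size_t len, void *ctx)
{
    (void)block;
    *(int *)ctx += 1;
    memset(buf, 0xff, len);
    return 0;
}
```

Calling demo_load_inode_bitmap() with demo_fake_read and an int counter
as ctx shows the counter staying at zero for a flagged group.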

I think all that is needed to get the semantics Jose wants is to tune
the inode allocation in ext4_new_inode() to avoid inode bitmaps that
are not yet initialized.  I suppose the other incremental feature would
be to allow the blocks in the inode table to be used for file allocation,
but this exposes us to potential malicious corruption in some cases if
users create "inode looking" data files (e.g. suid root inodes) on a full
filesystem and e2fsck is convinced to treat them as inodes.
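
As a sketch of the allocation tuning mentioned above (not the actual
ext4_new_inode() logic), a two-pass group search would do: the first pass
only considers groups whose inode bitmap is already initialized, and the
second pass falls back to uninitialized groups when nothing else has a
free inode.  The flag value, struct, and function name here are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BG_INODE_UNINIT 0x0001u /* hypothetical flag value */

/* Hypothetical per-group summary used only by this sketch. */
struct demo_group {
    uint16_t flags;       /* DEMO_BG_* flags */
    uint32_t free_inodes; /* free inode count for the group */
};

/*
 * Return the index of the group to allocate an inode from, or -1 if no
 * group has a free inode.  The search starts at 'start' and wraps.
 * Pass 0 skips groups flagged uninitialized; pass 1 accepts any group.
 */
static long demo_pick_inode_group(const struct demo_group *g, size_t n,
                                  size_t start)
{
    assert(g != NULL || n == 0);

    for (int pass = 0; pass < 2; pass++) {
        for (size_t i = 0; i < n; i++) {
            size_t idx = (start + i) % n;

            if (g[idx].free_inodes == 0)
                continue;
            if (pass == 0 && (g[idx].flags & DEMO_BG_INODE_UNINIT))
                continue;
            return (long)idx;
        }
    }
    return -1;
}
```

With this ordering the uninitialized groups stay untouched for as long as
any initialized group still has room, which is the packing behaviour Jose
is after.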

We might instead limit this space to directories and indirect/index
blocks, which wouldn't be a bad idea, but once we get to changing the
inode structures that much I'd like to combine this with several of the
other changes.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
