Re: [PATCH] e2fsprogs: New bitmap and inode table allocation for FLEX_BG v2

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Jose R. Santos" <jrs@us.ibm.com>
To: Theodore Tso <tytso@MIT.EDU>
Cc: linux-ext4 <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH] e2fsprogs: New bitmap and inode table allocation for FLEX_BG v2
Date: Mon, 24 Mar 2008 10:00:05 -0500	[thread overview]
Message-ID: <20080324100005.50f9bb18@gara> (raw)
In-Reply-To: <20080324134650.GA13621@mit.edu>

On Mon, 24 Mar 2008 09:46:50 -0400
Theodore Tso <tytso@MIT.EDU> wrote:

> I'm starting to audit this patch, and have a bunch of questions and
> observations.
> 
> On Wed, Feb 13, 2008 at 08:47:50PM -0600, Jose R. Santos wrote:
> > +void ext2fs_bgd_set_flex_meta_flag(ext2_filsys fs, blk_t block)
> > +{
> > +	dgrp_t	group;
> > +
> > +	group = ext2fs_group_of_blk(fs, block);
> > +	if (!(fs->group_desc[group].bg_flags & EXT2_BG_FLEX_METADATA))
> > +		fs->group_desc[group].bg_flags |= EXT2_BG_FLEX_METADATA;
> > +}
> 
> This function is used nowhere else but in lib/ext2fs/alloc_tables.c,
> and it's not declared in lib/ext2fs/ext2fs.h.  So I've renamed it to
> bgd_set_flex_meta_flag() and declared it static, to make it clear that
> it's a private function.

Yes, this should be rename static as it is only intended to be used in
alloc_tables.c.  Somehow my brain did not register that functions with
the ext2fs_ prefix implies being a API accessible routine.

> The other question which immediately comes to mind is *why* we need to
> set this flag in the first place.  The kernel doesn't use it, and
> there doesn't seem to be any reason why needs to be an on-disk flag at
> all.  It seems to be used as a way of communicating to mke2fs about
> whether or not we can safely set the EXT2_BG_BLOCK_UNINIT flag.  
> 
> This turns out to be a kludge whose short comings show other problems.
> The real problem is that most of the libext2fs isn't BLOCK_UNINIT
> aware.  So for example, if debugfs is used to write a file into the
> filesystem, and the block group doesn't have an initialized bitmap,
> the Wrong Thing will happen.  More to the point, if you use mke2fs to
> a 1k blocksize filesystem, and the journal is bigger than 16 megs, (or
> with a 4k blocksize filesystem, if the journal is bigger than 512
> megs), you could easily end up allocating the journal into a block
> group with BG_BLOCK_UNINIT.  Oops.

There are two reasons for adding the flag.  First to improve fsck
performance by not having to check all the bgd each time we need to set
BLOCK_UNINIT.  Since we did not define a limited range of were
meta-data could be allocated for a particular block group, not having
the flag could be very expensive on a very large fs.  The second reason
is that having a flag makes it possible to have the BLOCK_UNINIT flag
set on block groups with meta-data without taking a big impact when
initializing those block groups that dont have meta-data(currently
unimplemented in the kernel).  The kludge that we use to avoid
inaccurate free block counts in the kernel was to initialized all block
groups which contain meta-data.  The flag allow us to very quickly skip
block groups which do not contain meta-data and do a more thorough
search for those that do.

> This wasn't that much of a big deal since up until now lazy_bg was
> only used for debugging really big filesystems, and not much else.  It
> was a quick hack for debugging purposes only.  But given that
> uninititalized blockgroups are intended for more general use, we have
> to make sure all of these corner cases are handled correctly.
> 
> Just looking at it quickly, it seems like the right thing to do is
> split setup_lazy_bg() into two parts.  The first part sets
> EXT2_BG_BLOCK_UNINIT for all block groups, and then we modify the
> block allocation functions in lib/ext2fs to clear the BLOCK_UNINIT
> flag --- and then later on, we update the bg_free_blocks_count and
> s_free_blocks_count for the lazy_bg case.  
> 
> This needs more study though, and there is a similar issue, although
> not quite so serious about making sure all of libext2fs is
> INODE_UNINIT aware.
> 
> > +/*
> > + * This routine searches for free blocks that can allocate a full
> > + * group of bitmaps or inode tables for a flexbg group.  Returns the
> > + * block number with a correct offset were the bitmaps and inode
> > + * tables can be allocated continously and in order.
> > + */
> > +blk_t ext2fs_flexbg_offset(ext2_filsys fs, dgrp_t group, blk_t start_blk,
> > +			   ext2fs_block_bitmap bmap, int offset, int size)
> 
> See above comments about no one using this feature but
> lib/ext2fs/alloc_tables.c.  Is there reason why this function isn't
> declared static?  (And if it is renamed static, better to remove the
> ext2fs_ prefix, to make it clear it isn't a globally visible ext2fs
> library function.)

Ditto for this one too.

> 
> > diff --git a/lib/ext2fs/closefs.c b/lib/ext2fs/closefs.c
> > index a523c8e..83d7cc4 100644
> > --- a/lib/ext2fs/closefs.c
> > +++ b/lib/ext2fs/closefs.c
> > @@ -56,6 +56,7 @@ int ext2fs_super_and_bgd_loc(ext2_filsys fs,
> >  	unsigned int meta_bg, meta_bg_size;
> >  	blk_t	numblocks, old_desc_blocks;
> >  	int	has_super;
> > +	unsigned int flex_bg_size = 1 << fs->super->s_log_groups_per_flex;
> >  
> >  	group_block = ext2fs_group_first_block(fs, group);
> >  
> 
> The function doesn't use this new variable; so it should be just
> deleted and removed.
> 
> 						- Ted



-JRS

next prev parent reply	other threads:[~2008-03-24 15:00 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-02-14  2:47 [PATCH] e2fsprogs: New bitmap and inode table allocation for FLEX_BG v2 Jose R. Santos
2008-03-24 13:46 ` Theodore Tso
2008-03-24 15:00   ` Jose R. Santos [this message]
2008-03-25 22:12   ` Jose R. Santos
2008-03-25 22:24     ` Theodore Tso
2008-03-31 16:41   ` Jose R. Santos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080324100005.50f9bb18@gara \
    --to=jrs@us.ibm.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@MIT.EDU \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.