public inbox for linux-xfs@vger.kernel.org
From: David Chinner <dgc@sgi.com>
To: Nathan Scott <nscott@aconex.com>
Cc: Eric Sandeen <sandeen@sandeen.net>, Chris Wedgwood <cw@f00f.org>,
	Michael Nishimoto <miken@agami.com>,
	xfs@oss.sgi.com
Subject: Re: Allocating inodes from a single block
Date: Wed, 18 Jul 2007 13:50:12 +1000
Message-ID: <20070718035012.GA12413810@sgi.com>
In-Reply-To: <1184724090.15488.553.camel@edge.yarra.acx>

On Wed, Jul 18, 2007 at 12:01:30PM +1000, Nathan Scott wrote:
> On Tue, 2007-07-17 at 20:43 -0500, Eric Sandeen wrote:
> > Chris Wedgwood wrote:
> > > On Tue, Jul 17, 2007 at 11:11:50AM -0700, Michael Nishimoto wrote:
> > > 
> > >> Filesystem free space becomes fragmented over time.  It's possible
> > >> for total free space to be a decent size and still not have a chunk
> > >> large enough to allocate new inodes.
> > > 
> > > by default there is a restriction that inodes shouldn't consume more
> > > than 25% of the total space
> > > 
> > > see the mkfs.xfs man-page for details, search for 'maxpct'
> > > 
> > > for existing filesystems you can use xfs_db to rewrite this value
> 
> FWIW, xfs_growfs can be used to change this online.
> 
> > The problem is that inodes are allocated in "clusters" of blocks.
> > 
> > If your free blocks aren't such that they can form a cluster, I think
> > you're out of luck when trying to allocate new inodes if your existing
> > clusters are full.
> 
> Have you looked into this much Mike?  I've not recently, but from a
> quick peek it looks like the cluster size is set in xfs_mount.c as
> mp->m_inode_cluster_size and a different value is used depending on
> the machine's memory size ... so, perhaps this can be made a mount
> option?  (XFS_INODE_SMALL_CLUSTER_SIZE is 1FSB AFAICT).  But, maybe
> I'm missing something or not remembering some details here that'd
> make that infeasible.

The issue here is not the cluster size - that is purely an in-memory
arrangement for reading/writing multiple inodes at once. The issue
here is inode *chunks* (as Eric pointed out).

Basically, each record in the AGI btree has a 64-bit bit-field for
indicating whether the inodes in the chunk are used or free and a
64-bit address of the first block of the inode chunk.
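
A minimal sketch of that record and its bit-field, for illustration
only. The struct, field and function names are hypothetical, the
layout is simplified rather than the real on-disk format, and the
convention that a set bit means "free" is an assumption:

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64

/* Hypothetical, simplified record; not the real on-disk AGI btree layout. */
struct chunk_rec {
	uint64_t first_block;	/* address of the first block of the chunk */
	uint64_t free_mask;	/* assumed: bit i set means inode i is free */
};

/* Count the free inodes in the chunk by counting the set bits. */
static int chunk_free_count(const struct chunk_rec *rec)
{
	int n = 0;

	for (uint64_t m = rec->free_mask; m != 0; m &= m - 1)
		n++;
	return n;
}

/* Claim the lowest-numbered free inode; return its index, or -1 if full. */
static int chunk_alloc_inode(struct chunk_rec *rec)
{
	for (int i = 0; i < INODES_PER_CHUNK; i++) {
		uint64_t bit = UINT64_C(1) << i;

		if (rec->free_mask & bit) {
			rec->free_mask &= ~bit;
			return i;
		}
	}
	return -1;
}

/* Mark inode i in the chunk as free again. */
static void chunk_free_inode(struct chunk_rec *rec, int i)
{
	assert(i >= 0 && i < INODES_PER_CHUNK);
	rec->free_mask |= UINT64_C(1) << i;
}
```

One 64-bit word per record can only track 64 inodes, which is why the
chunk size is tied to the record format.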

It is assumed that all the inodes in the chunk are contiguous as
they are addressed in a compressed form - AG #, block # of first inode,
inode number in chunk.

That means that:

	a) the inode size across the entire AG must be fixed
	b) the inodes must be allocated in contiguous chunks of
	   64 inodes regardless of their size
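
To make the consequence concrete, here is a rough sketch of the
arithmetic. All names are hypothetical, alignment requirements are
ignored, and the free space is modelled as a plain array of extent
lengths rather than the real free space btrees:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INODES_PER_CHUNK 64

/* Blocks one chunk occupies: 64 fixed-size inodes, rounded up. */
static uint64_t chunk_blocks(uint32_t inode_size, uint32_t block_size)
{
	assert(inode_size > 0 && block_size > 0);
	uint64_t bytes = (uint64_t)INODES_PER_CHUNK * inode_size;

	return (bytes + block_size - 1) / block_size;
}

/*
 * Block holding inode `index` of a chunk, computed from the chunk's
 * first block alone. This only works because the chunk is contiguous
 * and every inode has the same size.
 */
static uint64_t inode_block(uint64_t first_block, unsigned int index,
			    uint32_t inode_size, uint32_t block_size)
{
	assert(index < INODES_PER_CHUNK);
	assert(inode_size > 0 && block_size > 0);
	return first_block + ((uint64_t)index * inode_size) / block_size;
}

/* True if any single free extent can hold a whole chunk. */
static int can_allocate_chunk(const uint64_t *extent_len, size_t n,
			      uint64_t need)
{
	for (size_t i = 0; i < n; i++)
		if (extent_len[i] >= need)
			return 1;
	return 0;
}
```

With 256-byte inodes and 4096-byte blocks a chunk needs 4 contiguous
blocks, so free extents of 3, 2, 3 and 1 blocks (9 free blocks in
total) still cannot hold a new chunk.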

To change this, you need to completely change the AGI format, the
inode allocation code and the inode freeing code and all the code that
assumes that inodes appear in 64 inode chunks e.g. bulkstat. Then
repair, xfs_db, mkfs, check, etc....

The best you can do to try to avoid these sorts of problems is
use the "ikeep" option to keep empty inode chunks around. That way
if you remove a bunch of files and then fragment free space, you'll
still be able to create new files until you run out of pre-allocated
inodes....

> Even better than a mount option would be to degrade to smaller size
> dynamically... not sure how hard that'd be either ... probably lots
> of corner cases lurking there.

And a major on-disk format change.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Thread overview: 11+ messages
2007-07-17 18:11 Allocating inodes from a single block Michael Nishimoto
2007-07-17 20:19 ` Chris Wedgwood
2007-07-17 21:01   ` Michael Nishimoto
2007-07-18  1:43   ` Eric Sandeen
2007-07-18  2:01     ` Nathan Scott
2007-07-18  3:50       ` David Chinner [this message]
2007-07-18 17:53         ` Michael Nishimoto
2007-07-18 19:10         ` Mike Montour
2007-07-19  2:30           ` David Chinner
2007-07-20  1:26             ` Mike Montour
     [not found] <200707231240.23425.david@fromorbit.com>
2007-07-23  5:06 ` David Chinner
