public inbox for linux-xfs@vger.kernel.org
From: David Chinner <dgc@sgi.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: David Chinner <dgc@sgi.com>, xfs-dev <xfs-dev@sgi.com>,
	xfs-oss <xfs@oss.sgi.com>
Subject: Re: [Patch] unique per-AG inode generation number initialisation
Date: Tue, 8 Apr 2008 07:52:03 +1000
Message-ID: <20080407215203.GB108924158@sgi.com>
In-Reply-To: <20080407125738.GD27350@infradead.org>

On Mon, Apr 07, 2008 at 08:57:38AM -0400, Christoph Hellwig wrote:
> I don't really like this.  The chance to hit a previously used generation
> seems too high.

The chance of hitting an existing generation number is almost non-existent.

The counter is incremented on every allocation and not just when
inode chunks are allocated on disk. Hence a series of "allocate
chunk, unlink + free chunk, realloc chunk" is guaranteed to get a
higher generation number on reallocation, as is the "allocate a
chunk, while [1] {allocate; unlink}, unlink chunk, reallocate
chunk." These are the issues that are causing us problems right
now.
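
As a rough illustration of the two sequences above, here is a minimal
user-space model in C. It is a sketch under stated assumptions, not the
patch itself: the names (ag_model, chunk_model, chunk_init and so on) and
the chunk size of 64 are hypothetical, it assumes an inode's generation is
bumped when it is unlinked, and it ignores counter wraparound.

```c
#include <stdint.h>

#define INODES_PER_CHUNK 64	/* illustrative chunk size, an assumption */

/* Hypothetical stand-ins for the per-AG counter and an on-disk inode chunk. */
struct ag_model { uint32_t gen_counter; };
struct chunk_model { uint32_t gen[INODES_PER_CHUNK]; };

/* A newly allocated chunk takes its initial generation from the AG counter. */
static void chunk_init(struct ag_model *ag, struct chunk_model *c)
{
	for (int i = 0; i < INODES_PER_CHUNK; i++)
		c->gen[i] = ag->gen_counter;
}

/* Every inode allocation bumps the AG counter, not just chunk allocation. */
static uint32_t inode_alloc(struct ag_model *ag, struct chunk_model *c, int idx)
{
	ag->gen_counter++;
	return c->gen[idx];	/* generation handed out to the new user */
}

/* Assumed behaviour: unlinking bumps the inode's own generation. */
static void inode_unlink(struct chunk_model *c, int idx)
{
	c->gen[idx]++;
}

/*
 * Churn inode 0 `cycles` times, then free and immediately reallocate the
 * chunk. Returns 1 if the first generation handed out afterwards is
 * strictly greater than every generation handed out before.
 */
static int realloc_gets_higher_gen(uint32_t start, uint32_t cycles)
{
	struct ag_model ag = { .gen_counter = start };
	struct chunk_model c;
	uint32_t max_seen = start;

	chunk_init(&ag, &c);
	for (uint32_t n = 0; n < cycles; n++) {
		uint32_t g = inode_alloc(&ag, &c, 0);
		if (g > max_seen)
			max_seen = g;
		inode_unlink(&c, 0);
	}
	chunk_init(&ag, &c);	/* chunk freed, then reallocated */
	return inode_alloc(&ag, &c, 0) > max_seen;
}
```

In this model the counter advances once per allocation, so it can never
fall behind any generation an inode in the chunk has handed out, whether
the chunk saw one allocation or many before being freed.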

The generation number won't get reused at all until it wraps at 2^32
allocations within the AG, and then you've got to have a chunk of inodes
get freed and reallocated at the same time the counter matches an inode
generation number. While not impossible, it'll be pretty rare....
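
The wrap arithmetic can be sketched with a small helper. The function
name is hypothetical and the code only models a free-running 32-bit
counter; it says nothing about how often chunks are actually freed and
reallocated.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of further allocations in the AG before a
 * 32-bit counter currently at `counter` next takes the value `target`.
 */
static uint64_t allocs_until_counter_hits(uint32_t counter, uint32_t target)
{
	/* Unsigned subtraction wraps modulo 2^32, which is what we want. */
	uint32_t delta = target - counter;

	/* If the values are equal now, the next match is a full wrap away. */
	return delta ? (uint64_t)delta : (UINT64_C(1) << 32);
}
```

For an inode whose generation is just behind the counter, for example
allocs_until_counter_hits(100, 99), the result is 2^32 - 1 allocations
before the counter could produce that value again.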

> What about making the first few bits of each generation
> number a per-ag counter that's incremented anytime we deallocate an inode
> cluster?

First thing I considered - increment on chunk freeing is not a
sufficient guarantee of short-term uniqueness. To guarantee
short-term uniqueness, the generation number used to initialise the inode
chunk if it is immediately reallocated needs to be greater than the
maximum used by any inode in the chunk that got freed. Now the "counter"
becomes a "maximum generation number used in the AG" value. This
also adds significant complexity to xfs_icluster_free() as we have to
look at every inode in the chunk and not just the ones that are
in-core.

FWIW, the biggest complexity with this approach is wrapping - how do
you tell what the highest generation number in the inode
chunk being freed is when some have wrapped through zero?
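
To make the wrapping problem concrete, the sketch below compares a plain
unsigned maximum with a wrap-aware one. The wrap-aware version uses
serial-number style comparison, which is a common general technique and
not something proposed in this thread; both function names are
hypothetical. It relies on the usual two's complement conversion from
uint32_t to int32_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain unsigned maximum: wrong once any generation has wrapped through zero. */
static uint32_t naive_max(const uint32_t *gen, size_t n)
{
	uint32_t best = gen[0];

	for (size_t i = 1; i < n; i++)
		if (gen[i] > best)
			best = gen[i];
	return best;
}

/*
 * Wrap-aware maximum: treats b as "later" than a when the signed
 * difference (b - a) is positive. Only meaningful while all values lie
 * within 2^31 of each other.
 */
static uint32_t serial_max(const uint32_t *gen, size_t n)
{
	uint32_t best = gen[0];

	for (size_t i = 1; i < n; i++)
		if ((int32_t)(gen[i] - best) > 0)
			best = gen[i];
	return best;
}
```

With the values { 0xFFFFFFF0, 5 }, where the second has wrapped,
naive_max() picks 0xFFFFFFF0 while serial_max() picks 5. The wrap-aware
form gives no sensible answer once generations in a chunk are more than
2^31 apart, which is the kind of corner case that is hard to test.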

I basically gave up on this approach because of the extra complexity
and nasty, untestable corner cases it introduced into code that is
already complex. A simple incrementing counter solves the short-term
uniqueness problem while still making it very hard to get duplicates in
the long term. If you really, really need long-term uniqueness, then
use the 'ikeep' mount option.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Thread overview: 6+ messages
2008-04-01 23:18 [Patch] unique per-AG inode generation number initialisation David Chinner
2008-04-02  4:02 ` Niv Sardi
2008-04-04  9:08 ` Hans-Peter Jansen
2008-04-07 12:57 ` Christoph Hellwig
2008-04-07 21:52   ` David Chinner [this message]
2008-04-10  4:34     ` David Chinner