public inbox for linux-kernel@vger.kernel.org
From: Nick Piggin <npiggin@suse.de>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	Andi Kleen <andi@firstfloor.org>,
	Dave Chinner <david@fromorbit.com>,
	Alexander Viro <viro@ftp.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	Christoph Lameter <clameter@sgi.com>,
	Rik van Riel <riel@redhat.com>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	akpm@linux-foundation.org, Miklos Szeredi <miklos@szeredi.hu>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Hugh Dickins <hugh@veritas.com>,
	linux-kernel@vger.kernel.org
Subject: Re: dentries: dentry defragmentation
Date: Mon, 1 Feb 2010 18:08:35 +1100
Message-ID: <20100201070835.GE9085@laptop>
In-Reply-To: <20100129220044.GA31305@ZenIV.linux.org.uk>

On Fri, Jan 29, 2010 at 10:00:44PM +0000, Al Viro wrote:
> On Fri, Jan 29, 2010 at 02:49:48PM -0600, Christoph Lameter wrote:
> > +		if ((d_unhashed(dentry) && list_empty(&dentry->d_lru)) ||
> > +		   (!d_unhashed(dentry) && hlist_unhashed(&dentry->d_hash)) ||
> > +		   (dentry->d_inode &&
> > +		   !mapping_cap_writeback_dirty(dentry->d_inode->i_mapping)))
> > +			/* Ignore this dentry */
> > +			v[i] = NULL;
> > +		else
> > +			/* dget_locked will remove the dentry from the LRU */
> > +			dget_locked(dentry);
> > +	}
> > +	spin_unlock(&dcache_lock);
> > +	return NULL;
> > +}
> 
> No.  As the matter of fact - fuck, no.  For one thing, it's going to race
> with umount.  For another, kicking busy dentry out of hash is worse than
> useless - you are just asking to get more and more copies of that sucker
> in dcache.  This is fundamentally bogus, especially since there is a 100%
> safe time for killing dentry - when dput() drives the refcount to 0 and
> you *are* doing dput() on the references you've acquired.  If anything, I'd
> suggest setting a flag that would trigger immediate freeing on the final
> dput().
> 
> And that does not cover the umount races.  You *can't* go around grabbing
> dentries without making sure that superblock won't be shut down under
> you.  And no, I don't know how to deal with that cleanly - simply bumping
> superblock ->s_count under sb_lock is enough to make sure it's not freed
> under you, but what you want is more than that.  An active reference would
> be enough, except that you'd get sudden "oh, sorry, now there's no way
> to make sure that superblock is shut down at umount(2), no matter what kind
> of setup you have".  So you really need to get ->s_umount held shared,
> which is not particularly locking-order-friendly, to put it mildly.
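
The "flag that triggers immediate freeing on the final dput()" idea
quoted above can be illustrated with a small user-space model. This is
only a sketch: struct dentry_model, kill_on_last_put and the model_*
helpers are hypothetical names invented for illustration, not dcache
code, and the umount race is deliberately not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a dentry; not the kernel struct. */
struct dentry_model {
	int refcount;
	bool kill_on_last_put;	/* assumed flag, set by the defrag code */
	bool on_lru;
	bool freed;
};

/* Take a reference; a referenced object is not on the LRU. */
static void model_dget(struct dentry_model *d)
{
	assert(!d->freed);
	d->refcount++;
	d->on_lru = false;
}

/* Defrag marks the object instead of unhashing it while it is busy. */
static void model_mark_for_defrag(struct dentry_model *d)
{
	d->kill_on_last_put = true;
}

/* Drop a reference; only the final put decides between LRU and free. */
static void model_dput(struct dentry_model *d)
{
	assert(d->refcount > 0);
	if (--d->refcount > 0)
		return;			/* still busy, leave it alone */
	if (d->kill_on_last_put)
		d->freed = true;	/* free immediately, skip the LRU */
	else
		d->on_lru = true;	/* normal path: park on the LRU */
}
```

The point of the model is that a busy object stays hashed and usable
until its last user goes away, so no duplicate copies are created.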

I have always preferred to do defrag the opposite way, i.e. query the
slab allocator from the existing shrinkers rather than the other way
around. This lets you reuse more of the existing locking, refcounting,
etc.

So you have a pin on the object somehow via the normal shrinker path,
and therefore you get a pin on the underlying slab. I would like to
see the performance of even a really simple approach that just asks
whether we are in this slab defrag mode and, if so, whether the slab
is very sparse. If yes, then reclaim aggressively.
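
As a rough illustration of that simple approach, here is a user-space
sketch of the decision a shrinker would make per object. All names
(struct slab_model, SPARSE_PERCENT, shrinker_decide) and the 25%
threshold are assumptions made up for the example; none of this is an
existing slab or shrinker interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the slab page backing a pinned object. */
struct slab_model {
	unsigned int inuse;	/* objects currently allocated */
	unsigned int objects;	/* total object slots in the slab */
};

/* Assumed threshold: a slab under 25% occupancy counts as sparse. */
#define SPARSE_PERCENT 25

static bool slab_is_sparse(const struct slab_model *s)
{
	assert(s->objects > 0 && s->inuse <= s->objects);
	return s->inuse * 100 < s->objects * SPARSE_PERCENT;
}

enum reclaim_action { RECLAIM_NORMAL, RECLAIM_AGGRESSIVE };

/*
 * Called from the (modeled) shrinker while it already holds a pin on
 * an object, and therefore on the slab that contains it.
 */
static enum reclaim_action shrinker_decide(bool defrag_mode,
					   const struct slab_model *s)
{
	if (defrag_mode && slab_is_sparse(s))
		return RECLAIM_AGGRESSIVE;
	return RECLAIM_NORMAL;
}
```

The shrinker keeps its usual locking and refcounting; the only new
piece is the query about how full the underlying slab is.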

If that doesn't perform well enough and you have to go further and
discover objects on the same slab, then it does get a bit more
tricky because:
- you need the pin on the first object in order to discover more
- discovered objects may not be expected in the existing shrinker
  code that just picks objects off LRUs

However, your code already has to handle the 2nd case anyway, and the
1st case is probably not too hard to do with dcache/icache. And in
either case you seem to avoid the worst of the sleeping, lock
ordering, and slab inversion problems of your ->get approach.
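
The "discover objects on the same slab" step could look roughly like
the following user-space sketch. It assumes a fixed-size slab and uses
made-up names (struct obj_model, struct slab_page_model,
discover_neighbours); it only shows the two constraints listed above:
the caller must already hold a pin, and discovered objects may not be
on any LRU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OBJS_PER_SLAB 8		/* assumed slab capacity for the model */

/* Hypothetical per-object state; not a kernel structure. */
struct obj_model {
	bool allocated;
	bool on_lru;
	int refcount;
};

struct slab_page_model {
	struct obj_model objs[OBJS_PER_SLAB];
};

/*
 * Starting from an object the caller has pinned, collect the indices
 * of the other unreferenced objects on the same slab. Returns how many
 * indices were written to out[].
 */
static size_t discover_neighbours(const struct slab_page_model *slab,
				  size_t pinned, size_t out[OBJS_PER_SLAB])
{
	size_t n = 0;

	assert(pinned < OBJS_PER_SLAB);
	assert(slab->objs[pinned].refcount > 0);	/* pin is required */

	for (size_t i = 0; i < OBJS_PER_SLAB; i++) {
		const struct obj_model *o = &slab->objs[i];

		if (i == pinned || !o->allocated)
			continue;
		if (o->refcount > 0)
			continue;	/* busy: do not kick it out */
		/* May or may not be on an LRU; the caller handles both. */
		out[n++] = i;
	}
	return n;
}
```

A caller would then feed the returned objects into the normal reclaim
path, checking on_lru first because the existing code only expects
objects it picked off an LRU itself.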

But I'm really interested to see numbers, and especially numbers of
the simpler approaches before adding this complexity.

