From: Dave Chinner <david@fromorbit.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Christoph Lameter <cl@linux-foundation.org>,
tytso@mit.edu, Andi Kleen <andi@firstfloor.org>,
Miklos Szeredi <miklos@szeredi.hu>,
Alexander Viro <viro@ftp.linux.org.uk>,
Christoph Hellwig <hch@infradead.org>,
Christoph Lameter <clameter@sgi.com>,
Rik van Riel <riel@redhat.com>,
Pekka Enberg <penberg@cs.helsinki.fi>,
akpm@linux-foundation.org, Nick Piggin <nickpiggin@yahoo.com.au>,
Hugh Dickins <hugh@veritas.com>,
linux-kernel@vger.kernel.org
Subject: Re: inodes: Support generic defragmentation
Date: Tue, 9 Feb 2010 09:13:18 +1100
Message-ID: <20100208221318.GL11483@discord.disaster>
In-Reply-To: <20100208073753.GC9781@laptop>

On Mon, Feb 08, 2010 at 06:37:53PM +1100, Nick Piggin wrote:
> On Thu, Feb 04, 2010 at 11:13:15AM -0600, Christoph Lameter wrote:
> > On Thu, 4 Feb 2010, Nick Piggin wrote:
> >
> > > Well what I described is to do the slab pinning from the reclaim path
> > > (rather than from slab calling into the subsystem). All slab locking
> > > basically "innermost", so you can pretty much poke the slab layer as
> > > much as you like from the subsystem.
> >
> > Reclaim/defrag is called from the reclaim path (of the VM). We could
> > enable a call from the fs reclaim code into the slab. But how would this
> > work?
>
> Well the exact details will depend, but I feel that things should
> get easier because you pin the object (and therefore the slab) via
> the normal and well tested reclaim paths.
>
> So for example, for dcache, you will come in and take the normal
> locks: dcache_lock, sb_lock, pin the sb, umount_lock. At which
> point you have pinned dentries without changing any locking. So
> then you can find the first entry on the LRU, and should be able
> to then build a list of dentries on the same slab.
>
> You still have the potential issue of now finding objects that would
> not be visible by searching the LRU alone. However at least the
> locking should be simplified.

Very true, but that leads us to the same problem of fragmented
caches because we empty unused objects off slabs that are still
pinned by hot objects and don't free the page. I agree that we can't
totally avoid this problem, but I still think that using an object
based LRU for reclaim has a fundamental mismatch with page based
reclaim that makes this problem worse than it could be.
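
To make the mismatch concrete, here is a toy userspace model
(illustration only; it is not kernel code and not part of the patch
set). The page layout, the (last_used, pinned) tuples and
object_lru_reclaim() are all made-up names. It frees the coldest
unpinned objects regardless of which page they live on, then counts
how many pages became completely empty:

```python
from collections import defaultdict


def object_lru_reclaim(pages, nr_to_free):
    """Free the nr_to_free coldest unpinned objects, ignoring page boundaries.

    pages maps a page id to a list of (last_used, pinned) tuples.
    Returns (objects_freed, pages_freed).
    """
    # Build one global object LRU, coldest first, skipping pinned objects.
    lru = sorted(
        (last_used, page_id, idx)
        for page_id, objs in pages.items()
        for idx, (last_used, pinned) in enumerate(objs)
        if not pinned
    )
    victims = lru[:nr_to_free]

    freed_per_page = defaultdict(int)
    for _, page_id, _ in victims:
        freed_per_page[page_id] += 1

    # A page goes back to the page allocator only if every object on it went.
    pages_freed = sum(
        1 for page_id, n in freed_per_page.items() if n == len(pages[page_id])
    )
    return len(victims), pages_freed


# Four slab pages; each holds one hot (pinned) object and three cold ones.
pages = {
    p: [(100, True)] + [(p * 3 + i, False) for i in range(3)]
    for p in range(4)
}
objects_freed, pages_freed = object_lru_reclaim(pages, nr_to_free=12)
print(objects_freed, pages_freed)
```

In this contrived layout all twelve cold objects are freed and not a
single page is returned, because one hot object pins each page.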

FWIW, if we change the above to keeping a page based LRU in the slab
cache and the slab picks a page to reclaim, then the problem goes
mostly away, I think. We don't need to pin the slab to select and
prepare a page to reclaim - the cache only needs to be locked before
it starts reclaim. I think this has a much better chance of
reclaiming entire pages in situations where LRU based reclaim will
leave fragmentation.

i.e. instead of:

	shrink_slab
	  -> external shrinker
	    -> lock cache
	      -> find reclaimable object
	        -> call into slab w/ object
	          -> return longer list of objects
	      -> reclaim objects

we do:

	shrink_slab
	  -> internal shrinker
	    -> find oldest page and make object list
	  -> external shrinker
	    -> lock cache
	      -> reclaim objects
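
The two call chains above can be sketched as a toy userspace
comparison (again illustration only, not kernel code). Everything
here is hypothetical: the function names, the (last_used, pinned)
tuples, and in particular the assumption that a page is as "old" as
its most recently used object, which the text above does not define.

```python
def page_age(objs):
    # Assumption: a page is as "old" as its most recently used object.
    return max(last_used for last_used, _ in objs)


def reclaim_page(objs):
    """External shrinker step: free every unpinned object on one page.

    Returns (objects_freed, page_freed).
    """
    survivors = [o for o in objs if o[1]]
    return len(objs) - len(survivors), not survivors


def object_first(pages):
    """First flow: the coldest object on the LRU selects the page."""
    candidates = [
        (last_used, page_id)
        for page_id, objs in pages.items()
        for last_used, pinned in objs
        if not pinned
    ]
    _, page_id = min(candidates)
    return page_id, reclaim_page(pages[page_id])


def page_first(pages):
    """Second flow: the slab selects its oldest page, then objects go."""
    page_id = min(pages, key=lambda p: page_age(pages[p]))
    return page_id, reclaim_page(pages[page_id])


# (last_used, pinned) per object. Page "a" holds the single coldest object
# but also a hot pinned one; page "b" is uniformly cold.
pages = {
    "a": [(1, False), (90, True), (95, False), (99, False)],
    "b": [(5, False), (6, False), (7, False), (8, False)],
}
print(object_first(pages))
print(page_first(pages))
```

With this layout the object-first flow lands on page "a", frees three
objects and leaves the page pinned by the hot one, while the
page-first flow picks page "b" and frees the whole page.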

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com