public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: tytso@mit.edu
To: Andi Kleen <andi@firstfloor.org>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	Dave Chinner <david@fromorbit.com>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Alexander Viro <viro@ftp.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	Christoph Lameter <clameter@sgi.com>,
	Rik van Riel <riel@redhat.com>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	akpm@linux-foundation.org, Nick Piggin <nickpiggin@yahoo.com.au>,
	Hugh Dickins <hugh@veritas.com>,
	linux-kernel@vger.kernel.org
Subject: Re: inodes: Support generic defragmentation
Date: Sun, 31 Jan 2010 16:02:07 -0500	[thread overview]
Message-ID: <20100131210207.GA27883@thunk.org> (raw)
In-Reply-To: <20100131083409.GF29555@one.firstfloor.org>

On Sun, Jan 31, 2010 at 09:34:09AM +0100, Andi Kleen wrote:
> 
> The standard case is the classic updatedb. Lots of dentries/inodes cached
> with no or little corresponding data cache.
> 
> > a huge number of pages that are actively getting used, the thrashing
> > that is going to result is going to enormous.
> 
> I think the consensus so far is to try to avoid any inodes/dentries
> which are dirty or used in any way.

OK, but in that case, the kick_inodes should check to see if the inode
is in use in any way (i.e., has dentries open that will tie it down,
is open, has pages that are dirty or are mapped into some page table)
before attempting to invalidating any of its pages.  The patch as
currently constituted doesn't do that.  It will attempt to drop all
pages owned by that inode before checking for any of these conditions.
If I wanted that, I'd just do "echo 3 > /proc/sys/vm/drop_caches".  

Worse yet, *after* it does this, it tries to write out the pages the
inode.  #1, this is pointless, since if the inode had any dirty pages,
they wouldn't have been invalidated, since it calls write_inode_now()
*after* calling invalidate_mapping_pages(), so the previously dirty
pages will still be mapped, and prevent the the inode from being
flushed.  #2, it interferes with delayed allocation and becomes
another writeback path, which means some dirty pages might get flushed
too early and it does this writeout without any of the congestion
handling code in the bdi writeback paths.

If the consensus is "avoid any inodes/dentries which are dirty or
used in any way", THIS PATCH DOESN'T DO THAT.

I'd go further, and say that it should avoid trying to flush any inode
if any of its sibling inodes on the slab cache are dirty or in use in
any way.  Otherwise, you end up dropping pages from the page cache and
still not be able to do any defragmentation.

> I personally would prefer it to be more aggressive for memory offlining
> though for RAS purposes though, but just handling the unused cases is a 
> good first step.

If you want something more aggressive, why not just "echo 3 >
/proc/sys/vm/drop_caches"?  We have that already.  If the answer is,
because it will trash the performance of the system, I'm concerned
this patch series will do this --- consistently.

If the concern is that the inode cache is filled with crap after an
updatedb run, then we should fix *that* problem; we need a way for
programs like updatedb to indicate that they are scanning lots of
inodes, and if the inode wasn't in cache before it was opened, it
should be placed on the short list to be dropped after it's closed.
Making that a new open(2) flag makes a lot of sense.  Let's solve the
real problem here, if that's the concern.

But most of the time, I *want* the page cache filled, since it means
less time wasted accessing spinning rust platters.  The last thing I
want is a some helpful defragmentation kernel thread constantly
wandering through inode caches, and randomly calling
"invalidate_mapping_pages" on inodes since it thinks this will help
defrag huge pages.  If I'm not running an Oracle database on my
laptop, but instead am concerned about battery lifetime, this is the
last thing I would want.

					- Ted

  parent reply	other threads:[~2010-01-31 21:02 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-29 20:49 Slab Fragmentation Reduction V15 Christoph Lameter
2010-01-29 20:49 ` slub: Add defrag_ratio field and sysfs support Christoph Lameter
2010-01-29 20:49 ` slub: Replace ctor field with ops field in /sys/slab/* Christoph Lameter
2010-01-29 20:49 ` slub: Add get() and kick() methods Christoph Lameter
2010-01-29 20:49 ` slub: Sort slab cache list and establish maximum objects for defrag slabs Christoph Lameter
2010-01-29 20:49 ` slub: Slab defrag core Christoph Lameter
2010-01-29 20:49 ` slub: Add KICKABLE to avoid repeated kick() attempts Christoph Lameter
2010-01-29 20:49 ` slub: Extend slabinfo to support -D and -F options Christoph Lameter
2010-01-29 20:49 ` slub/slabinfo: add defrag statistics Christoph Lameter
2010-01-29 20:49 ` slub: Trigger defragmentation from memory reclaim Christoph Lameter
2010-01-29 20:49 ` buffer heads: Support slab defrag Christoph Lameter
2010-01-30  1:59   ` Dave Chinner
2010-02-01  6:39   ` Nick Piggin
2010-01-29 20:49 ` inodes: Support generic defragmentation Christoph Lameter
2010-01-30  2:43   ` Dave Chinner
2010-02-01 17:50     ` Christoph Lameter
2010-01-30 19:26   ` tytso
2010-01-31  8:34     ` Andi Kleen
2010-01-31 13:59       ` Dave Chinner
2010-02-03 15:31         ` Christoph Lameter
2010-02-04  0:34           ` Dave Chinner
2010-02-04  3:07             ` tytso
2010-02-04  3:39               ` Dave Chinner
2010-02-04  9:33                 ` Nick Piggin
2010-02-04 17:13                   ` Christoph Lameter
2010-02-08  7:37                     ` Nick Piggin
2010-02-08 17:40                       ` Christoph Lameter
2010-02-08 22:13                       ` Dave Chinner
2010-02-04 16:59                 ` Christoph Lameter
2010-02-06  0:39                   ` Dave Chinner
2010-01-31 21:02       ` tytso [this message]
2010-02-01 10:17         ` Andi Kleen
2010-02-01 13:47           ` tytso
2010-02-01 13:54             ` Andi Kleen
2010-01-29 20:49 ` Filesystem: Ext2 filesystem defrag Christoph Lameter
2010-01-29 20:49 ` Filesystem: Ext3 " Christoph Lameter
2010-01-29 20:49 ` Filesystem: Ext4 " Christoph Lameter
2010-01-29 20:49 ` Filesystem: XFS slab defragmentation Christoph Lameter
2010-01-29 20:49 ` Filesystems: /proc filesystem support for slab defrag Christoph Lameter
2010-01-29 20:49 ` dentries: dentry defragmentation Christoph Lameter
2010-01-29 22:00   ` Al Viro
2010-02-01  7:08     ` Nick Piggin
2010-02-01 10:10       ` Andi Kleen
2010-02-01 10:16         ` Nick Piggin
2010-02-01 10:22           ` Andi Kleen
2010-02-01 10:35             ` Nick Piggin
2010-02-01 10:45               ` Andi Kleen
2010-02-01 10:56                 ` Nick Piggin
2010-02-01 13:25                   ` Andi Kleen
2010-02-01 13:36                     ` Nick Piggin
2010-01-29 20:49 ` slub defrag: Transition patch upstream -> -next Christoph Lameter
2010-01-30  8:54 ` Slab Fragmentation Reduction V15 Pekka Enberg
2010-01-30 10:48 ` Andi Kleen
2010-01-30 14:53   ` Rik van Riel
2010-02-01 17:53     ` Christoph Lameter
2010-02-01 17:52   ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100131210207.GA27883@thunk.org \
    --to=tytso@mit.edu \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=cl@linux-foundation.org \
    --cc=clameter@sgi.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=hugh@veritas.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=nickpiggin@yahoo.com.au \
    --cc=penberg@cs.helsinki.fi \
    --cc=riel@redhat.com \
    --cc=viro@ftp.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox