linux-mm.kvack.org archive mirror
From: Dave Chinner <dchinner@redhat.com>
To: Rik van Riel <riel@redhat.com>
Cc: Ying Han <yinghan@google.com>, Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>, Mel Gorman <mel@csn.ul.ie>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Greg Thelen <gthelen@google.com>,
	Christoph Lameter <cl@linux.com>,
	KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
	Glauber Costa <glommer@parallels.com>,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/6] memcg: vfs isolation in memory cgroup
Date: Fri, 17 Aug 2012 09:41:57 +1000	[thread overview]
Message-ID: <20120816234157.GB2776@devil.redhat.com> (raw)
In-Reply-To: <502D61E1.8040704@redhat.com>

On Thu, Aug 16, 2012 at 05:10:57PM -0400, Rik van Riel wrote:
> On 08/16/2012 04:53 PM, Ying Han wrote:
> >The patchset adds the ability to isolate vfs slab objects per-memcg
> >under reclaim. This feature is a *must-have* after kernel slab memory
> >accounting, which starts charging slab objects to individual memcgs. The
> >existing per-superblock shrinker doesn't work since it will end up
> >reclaiming slabs charged to other memcgs.

What list was this posted to?

The per-sb shrinkers are not intended for memcg granularity - they
are for scalability, in that they allow the removal of the global
inode and dcache LRU locks and give filesystems significant
flexibility in cache reclaim strategies. Hint: reclaiming
the VFS inode cache doesn't free any memory on an XFS filesystem -
it's the XFS inode cache shrinker, integrated into the per-sb
shrinker infrastructure, that frees all the memory. It doesn't work
without the per-sb shrinker functionality, and it's an extremely
performance-critical balancing act. Hence any changes to this
shrinker infrastructure need a lot of consideration and testing,
most especially to ensure that the balance of the system has not
been disturbed.

Also, how do you propose to solve the problem of inodes and dentries
shared across multiple memcgs?  They can only be tracked in one LRU,
but the caches are global and globally accessed. Memory pressure in
a single memcg that causes globally accessed dentries and inodes to
be tossed from memory will simply cause cache thrashing, and
performance across the system will tank.

> >The patch currently handles only the dentry cache, given that
> >dentries pin inodes. Based on the data we've collected, that
> >contributes the main share of the reclaimable slab objects. We could
> >also build a generic infrastructure for all the shrinkers (if needed).
> 
> Dave Chinner has some prototype code for that.

The patchset I have makes the dcache lru locks per-sb as the first
step to introducing generic per-sb LRU lists, and then builds on
that to provide generic kernel-wide LRU lists with integrated
shrinkers, and builds on that to introduce node-awareness (i.e. NUMA
scalability) into the LRU list so everyone gets scalable shrinkers.
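The node-awareness described above can be sketched as a userspace
model: one LRU list per NUMA node, so a shrinker under memory
pressure on one node only walks that node's list. This is purely
illustrative - the struct and function names here are hypothetical,
not the kernel's actual list_lru API:

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_NODES 4

struct lru_item {
	struct lru_item *next;
	int nid;		/* node the item's memory lives on */
};

struct node_lru {
	struct lru_item *head;	/* per-node list; per-node lock in real code */
	long nr_items;
};

struct numa_lru {
	struct node_lru node[MAX_NODES];
};

/* Add an item to the LRU for the node its memory lives on. */
static void lru_add(struct numa_lru *lru, struct lru_item *item)
{
	struct node_lru *nl = &lru->node[item->nid];

	item->next = nl->head;
	nl->head = item;
	nl->nr_items++;
}

/*
 * Reclaim up to nr_to_scan items from a single node's list.  A
 * node-aware shrinker never has to touch, or lock, the lists of
 * nodes that are not under pressure.
 */
static long lru_shrink_node(struct numa_lru *lru, int nid, long nr_to_scan)
{
	struct node_lru *nl = &lru->node[nid];
	long freed = 0;

	while (nl->head && freed < nr_to_scan) {
		struct lru_item *victim = nl->head;

		nl->head = victim->next;
		nl->nr_items--;
		free(victim);
		freed++;
	}
	return freed;
}
```

The design point is that both the lock and the list are per-node, so
reclaim on one node neither scans nor contends with objects on other
nodes.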

I've looked at memcg awareness in the past, but the problem is the
overhead - the explosion of LRUs because of the per-sb X per-node X
per-memcg object tracking matrix.  It's a huge amount of overhead
and complexity, and unless there's a way of efficiently tracking
objects both per-node and per-memcg simultaneously, I'm of the
opinion that memcg awareness is simply too much trouble, complexity
and overhead to bother with.
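To put rough numbers on that matrix explosion (the counts below are
hypothetical, not measurements from any system):

```c
#include <assert.h>

/*
 * One LRU list - with its own head, lock and item count - is needed
 * per cell of the per-sb X per-node X per-memcg tracking matrix.
 */
static long lru_matrix_size(long superblocks, long numa_nodes, long memcgs)
{
	return superblocks * numa_nodes * memcgs;
}
```

For, say, 50 mounted filesystems on a 4-node machine running 2000
active memory cgroups, that is 400,000 separate LRU lists to
allocate, lock and balance - which is the overhead concern in a
nutshell.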

So, convince me you can solve the various problems. ;)

Cheers,

Dave.
-- 
Dave Chinner
dchinner@redhat.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: dont@kvack.org

Thread overview: 12+ messages
2012-08-16 20:53 [RFC PATCH 0/6] memcg: vfs isolation in memory cgroup Ying Han
2012-08-16 21:10 ` Rik van Riel
2012-08-16 23:41   ` Dave Chinner [this message]
2012-08-17  5:15     ` Glauber Costa
2012-08-17  5:40       ` Ying Han
2012-08-17  5:42         ` Glauber Costa
2012-08-17  7:56           ` Dave Chinner
2012-08-19  3:41         ` Andi Kleen
2012-08-17  7:54       ` Dave Chinner
2012-08-17 10:00         ` Glauber Costa
2012-08-17 19:18           ` Ying Han
2012-08-17 14:44         ` Rik van Riel
