From: Vladimir Davydov <vdavydov@parallels.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.cz>, Greg Thelen <gthelen@google.com>,
Dave Chinner <david@fromorbit.com>,
Glauber Costa <glommer@gmail.com>,
Suleiman Souhlal <suleiman@google.com>,
Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Tejun Heo <tj@kernel.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org
Subject: Re: [PATCH -mm 00/14] Per memcg slab shrinkers
Date: Mon, 29 Sep 2014 11:02:53 +0400
Message-ID: <20140929070252.GA16447@esperanza>
In-Reply-To: <cover.1411301245.git.vdavydov@parallels.com>
ping
On Sun, Sep 21, 2014 at 07:14:32PM +0400, Vladimir Davydov wrote:
> Hi,
>
> Kmem accounting of memcg is unusable now, because it lacks slab shrinker
> support. That means when we hit the limit we will get ENOMEM w/o any
> chance to recover. What we should do then is to call shrink_slab, which
> would reclaim old inode/dentry caches from this cgroup. This is what
> this patch set is intended to do.
>
> Basically, it does two things. First, it introduces the notion of a
> per-memcg slab shrinker. A shrinker that wants to reclaim objects per
> cgroup should mark itself as SHRINKER_MEMCG_AWARE. Then it will be
> passed the memory cgroup to scan from in shrink_control->memcg. For such
> shrinkers shrink_slab iterates over the whole cgroup subtree under the
> target cgroup and calls the shrinker for each kmem-active memory cgroup.
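
The subtree walk described above can be sketched in userspace C. This is only a rough model under stated assumptions: memcg_model, shrink_control_model, shrinker_model and shrink_slab_model are hypothetical stand-ins, not the kernel structures or the code in the patches, and a plain recursive walk replaces whatever hierarchy iterator the real shrink_slab uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
struct memcg_model {
	bool kmem_active;                 /* kmem accounting enabled here? */
	unsigned long nr_objects;         /* reclaimable objects charged here */
	struct memcg_model *first_child;
	struct memcg_model *next_sibling;
};

struct shrink_control_model {
	unsigned long nr_to_scan;
	struct memcg_model *memcg;        /* cgroup the shrinker should scan */
};

struct shrinker_model {
	bool memcg_aware;                 /* models SHRINKER_MEMCG_AWARE */
	unsigned long (*scan_objects)(struct shrinker_model *s,
				      struct shrink_control_model *sc);
};

/* Example callback: frees up to nr_to_scan objects from sc->memcg. */
static unsigned long scan_memcg_objects(struct shrinker_model *s,
					struct shrink_control_model *sc)
{
	unsigned long n;

	(void)s;
	assert(sc->memcg != NULL);
	n = sc->memcg->nr_objects;
	if (n > sc->nr_to_scan)
		n = sc->nr_to_scan;
	sc->memcg->nr_objects -= n;
	return n;
}

/*
 * Visit the target cgroup and every descendant, calling the shrinker
 * once for each kmem-active cgroup. Inactive cgroups are skipped, but
 * their children are still visited.
 */
static unsigned long shrink_slab_model(struct shrinker_model *s,
				       struct memcg_model *target,
				       unsigned long nr_to_scan)
{
	unsigned long freed = 0;
	struct memcg_model *child;

	assert(s->memcg_aware);
	if (target->kmem_active) {
		struct shrink_control_model sc = {
			.nr_to_scan = nr_to_scan,
			.memcg = target,
		};
		freed += s->scan_objects(s, &sc);
	}
	for (child = target->first_child; child; child = child->next_sibling)
		freed += shrink_slab_model(s, child, nr_to_scan);
	return freed;
}
```

In this model the same nr_to_scan budget is handed to every cgroup in the subtree; how the real code splits reclaim pressure between cgroups is not something this sketch tries to capture.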
>
> Secondly, this patch set makes the list_lru structure per-memcg. It's
> done transparently to list_lru users - all they have to do is
> tell list_lru_init that they want a memcg-aware list_lru. Then the
> list_lru will automatically distribute objects among per-memcg lists
> based on which cgroup the object is accounted to. This way, to make FS
> shrinkers (icache, dcache) memcg-aware, we only need to make them use
> a memcg-aware list_lru, and this is what this patch set does.
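
To illustrate the idea of a list_lru that picks a per-memcg list on insertion, here is a minimal userspace sketch. Everything in it is an assumption made for illustration: list_lru_model, lru_item, the fixed MAX_MEMCG_IDS bound and the *_model functions are hypothetical, and the real list_lru is also split per NUMA node and protected by locks, both of which are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MEMCG_IDS 4  /* hypothetical fixed bound; the real arrays grow */

struct lru_item {
	struct lru_item *next;
	int memcg_id;    /* cache id of the owning cgroup, or -1 for none */
};

struct lru_list {
	struct lru_item *head;
	unsigned long nr_items;
};

struct list_lru_model {
	bool memcg_aware;
	struct lru_list global;                /* used when not memcg-aware */
	struct lru_list memcg[MAX_MEMCG_IDS];  /* indexed by cache id */
};

static void list_lru_init_model(struct list_lru_model *lru, bool memcg_aware)
{
	int i;

	lru->memcg_aware = memcg_aware;
	lru->global.head = NULL;
	lru->global.nr_items = 0;
	for (i = 0; i < MAX_MEMCG_IDS; i++) {
		lru->memcg[i].head = NULL;
		lru->memcg[i].nr_items = 0;
	}
}

/* Pick the list an item belongs on; callers never choose it themselves. */
static struct lru_list *lru_list_for(struct list_lru_model *lru,
				     const struct lru_item *item)
{
	if (!lru->memcg_aware || item->memcg_id < 0)
		return &lru->global;
	assert(item->memcg_id < MAX_MEMCG_IDS);
	return &lru->memcg[item->memcg_id];
}

static void list_lru_add_model(struct list_lru_model *lru,
			       struct lru_item *item)
{
	struct lru_list *l = lru_list_for(lru, item);

	item->next = l->head;
	l->head = item;
	l->nr_items++;
}

/* Count items for one cgroup; a negative id selects the global list. */
static unsigned long list_lru_count_model(struct list_lru_model *lru,
					  int memcg_id)
{
	if (memcg_id < 0)
		return lru->global.nr_items;
	assert(memcg_id < MAX_MEMCG_IDS);
	return lru->memcg[memcg_id].nr_items;
}
```

The point of the sketch is that the add path has the same signature whether or not the lru is memcg-aware; only the flag passed at init time changes where objects end up.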
>
> The main difference between this patch set and my previous attempts to push
> memcg-aware shrinkers is in how it handles css offline. Now we don't let
> list_lrus corresponding to dead memory cgroups hang around till all
> objects are freed. Instead we move lru items to the parent cgroup's lru
> list. This is really important, because this allows us to release
> memcg_cache_id used for indexing in per-memcg arrays. If we don't do
> this, the arrays will grow uncontrollably, which is really bad. Note, in
> comparison to user memory reparenting, which Johannes is going to get
> rid of, it's not racy and much easier to implement although it does
> impose some limitations on how list_lru locking can be implemented.
> Another difference is that it doesn't reparent charges, only list_lru
> entries - the css will be dangling until the last kmem object is freed.
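
The reparenting step can be sketched as a list splice followed by releasing the id. Again this is a simplified userspace model, not the patch code: lru_model, memcg_model, the id_in_use array and memcg_offline_model are hypothetical names, locking is omitted entirely, and the parent is assumed to hold a valid cache id of its own.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IDS 4  /* hypothetical fixed bound on cache ids */

struct lru_item {
	struct lru_item *next;
};

struct lru_list {
	struct lru_item *head, *tail;
	unsigned long nr_items;
};

struct lru_model {
	struct lru_list lists[MAX_IDS];  /* per-memcg lists, indexed by id */
	int id_in_use[MAX_IDS];          /* 1 while a live cgroup owns the id */
};

struct memcg_model {
	int cache_id;                    /* -1 once released */
	struct memcg_model *parent;
};

static void lru_model_init(struct lru_model *lru)
{
	int i;

	for (i = 0; i < MAX_IDS; i++) {
		lru->lists[i].head = lru->lists[i].tail = NULL;
		lru->lists[i].nr_items = 0;
		lru->id_in_use[i] = 0;
	}
}

static void lru_add(struct lru_list *l, struct lru_item *item)
{
	item->next = l->head;
	l->head = item;
	if (!l->tail)
		l->tail = item;
	l->nr_items++;
}

/* Move every item from src to the front of dst, leaving src empty. */
static void lru_splice(struct lru_list *dst, struct lru_list *src)
{
	if (!src->head)
		return;
	src->tail->next = dst->head;
	dst->head = src->head;
	if (!dst->tail)
		dst->tail = src->tail;
	dst->nr_items += src->nr_items;
	src->head = src->tail = NULL;
	src->nr_items = 0;
}

/* On offline: hand the items to the parent, then free the id for reuse. */
static void memcg_offline_model(struct lru_model *lru,
				struct memcg_model *memcg)
{
	assert(memcg->parent != NULL && memcg->parent->cache_id >= 0);
	assert(memcg->cache_id >= 0 && memcg->cache_id < MAX_IDS);
	lru_splice(&lru->lists[memcg->parent->cache_id],
		   &lru->lists[memcg->cache_id]);
	lru->id_in_use[memcg->cache_id] = 0;
	memcg->cache_id = -1;
}
```

Only list entries move in this model. The objects themselves stay charged to the dead cgroup, which matches the note above that charges are not reparented.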
>
> As before, this patch set only enables per-memcg kmem reclaim when the
> pressure comes from memory.limit, not from memory.kmem.limit. Handling
> memory.kmem.limit is going to be tricky due to GFP_NOFS allocations; it
> will probably require a sort of soft limit to work properly. I'm leaving
> this for future work.
>
> The patch set basically consists of three main parts and is organized as
> follows:
>
> - Patches 1-3 implement per-memcg shrinker core with patches 1 and 2
> preparing list_lru users for upcoming changes and patch 3 tuning
> shrink_slab.
>
> - Patches 4-10 make memcg core release cache ids on offline, doing a bit
> of cleanup along the way. This is easy, because kmem_caches don't
> need the cache id after css offline since there can't be allocations
> coming from a dead memcg. Note that most of these patches (namely 4-6,
> and 8) were once merged, but then I decided to drop them, because I
> didn't know how to deal with list_lrus at that time (see
> https://lkml.org/lkml/2014/7/23/218).
>
> - Finally patches 11-14 make list_lru per-memcg and mark FS shrinkers
> as memcg-aware. This is the most difficult part of this patch set
> with patch 13 (unlucky :-) doing the most important work.
>
> Reviews are more than welcome.
Thread overview:
2014-09-21 15:14 [PATCH -mm 00/14] Per memcg slab shrinkers Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 01/14] list_lru: introduce list_lru_shrink_{count,walk} Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 02/14] fs: consolidate {nr,free}_cached_objects args in shrink_control Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 03/14] vmscan: shrink slab on memcg pressure Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 04/14] memcg: use mem_cgroup_id for per memcg cache naming Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 05/14] memcg: add pointer to owner cache to memcg_cache_params Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 06/14] memcg: keep all children of each root cache on a list Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 07/14] memcg: update memcg_caches array entries on the slab side Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 08/14] memcg: release memcg_cache_id on css offline Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 09/14] memcg: rename some cache id related variables Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 10/14] memcg: add rwsem to sync against memcg_caches arrays relocation Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 11/14] list_lru: get rid of ->active_nodes Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 12/14] list_lru: organize all list_lrus to list Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 13/14] list_lru: introduce per-memcg lists Vladimir Davydov
2014-09-21 15:14 ` [PATCH -mm 14/14] fs: make shrinker memcg aware Vladimir Davydov
2014-09-21 16:00 ` [PATCH -mm 00/14] Per memcg slab shrinkers Tejun Heo
2014-09-22 7:04 ` Vladimir Davydov
2014-09-29 7:02 ` Vladimir Davydov [this message]