From: Vladimir Davydov <vdavydov@parallels.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: akpm@linux-foundation.org, mhocko@suse.cz, cl@linux.com,
glommer@gmail.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH -mm 0/8] memcg: reparent kmem on css offline
Date: Mon, 7 Jul 2014 19:40:08 +0400
Message-ID: <20140707154008.GH13827@esperanza>
In-Reply-To: <20140707142506.GB1149@cmpxchg.org>

Hi Johannes,

On Mon, Jul 07, 2014 at 10:25:06AM -0400, Johannes Weiner wrote:
> Hi Vladimir,
>
> On Mon, Jul 07, 2014 at 04:00:05PM +0400, Vladimir Davydov wrote:
> > Hi,
> >
> > This patch set introduces re-parenting of kmem charges on memcg css
> > offline. The idea lying behind it is very simple - instead of pointing
> > from kmem objects (kmem caches, non-slab kmem pages) directly to the
> > memcg which they are charged against, we make them point to a proxy
> > object, mem_cgroup_kmem_context, which, in turn, points to the memcg
> > which it belongs to. As a result on memcg offline, it's enough to only
> > re-parent the memcg's mem_cgroup_kmem_context.
>
> The motivation for this was to clear out all references to a memcg by
> the time it's offlined, so that the unreachable css can be freed soon.
>
> However, recent cgroup core changes further disconnected the css from
> the cgroup object itself, so it's no longer as urgent to free the css.
>
> In addition, Tejun made offlined css iterable and split css_tryget()
> and css_tryget_online(), which would allow memcg to pin the css until
> the last charge is gone while continuing to iterate and reclaim it on
> hierarchical pressure, even after it was offlined.
>
> This would obviate the need for reparenting as a whole, not just kmem
> pages, but even remaining page cache. Michal already obsoleted the
> force_empty knob that reparents as a fallback, and whether the cache
> pages are in the parent or in a ghost css after cgroup deletion does
> not make a real difference from a user point of view, they still get
> reclaimed when the parent experiences pressure.

So that means there's no need for a proxy object between kmem objects
and the memcg they are charged against (mem_cgroup_kmem_context in
this patch set), because it's now OK to pin the css from kmem allocations.
Furthermore, there will be no need to reparent per-memcg list_lrus when
they are introduced. That's nice!

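For reference, the indirection that would become unnecessary can be modelled roughly as below. This is only a user-space sketch: the struct and function names (memcg_model, kmem_ctx_model, kmem_obj_model, reparent_ctx) are made up for illustration and are not the definitions from the patch set. It shows why re-parenting the proxy is a single pointer update, however many kmem objects share it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; not the kernel's actual structures. */
struct memcg_model {
	struct memcg_model *parent;
	const char *name;
};

/* Stand-in for the proxy object (mem_cgroup_kmem_context in the patch set). */
struct kmem_ctx_model {
	struct memcg_model *memcg;
};

/* A kmem object (a cache or a non-slab page) points at the proxy only. */
struct kmem_obj_model {
	struct kmem_ctx_model *ctx;
};

/* Resolve the memcg an object is currently charged against. */
static struct memcg_model *obj_memcg(const struct kmem_obj_model *obj)
{
	return obj->ctx->memcg;
}

/* On offline, one pointer update re-parents every object sharing ctx. */
static void reparent_ctx(struct kmem_ctx_model *ctx)
{
	assert(ctx->memcg->parent != NULL);
	ctx->memcg = ctx->memcg->parent;
}
```

If the css is simply pinned until the last charge goes away, objects could point at the memcg directly and this extra hop would not be needed.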
> You could then reap dead slab caches as part of the regular per-memcg
> slab scanning in reclaim, without having to resort to auxiliary lists,
> vmpressure events etc.

Do you mean adding a per-memcg shrinker that will call kmem_cache_shrink
for all memcg caches on memcg/global pressure?

Actually, I recently made dead caches self-destructive at the cost of
slowing down kfrees to dead caches (see
https://www.lwn.net/Articles/602330/, it's already in the mmotm tree), so
no dead cache reaping is necessary. Do you think we still need it?

> I think it would save us a lot of code and complexity. You want
> per-memcg slab scanning *anyway*, all we'd have to change in the
> existing code would be to pin the css until the LRUs and kmem caches
> are truly empty, and switch mem_cgroup_iter() to css_tryget().
>
> Would this make sense to you?

Hmm, interesting. Thank you for such a thorough explanation.

One question. Do we still need to free mem_cgroup->kmemcg_id on css
offline so that it can be reused by new kmem-active cgroups (currently
we don't)?

If we don't free it, the root_cache->memcg_params->memcg_arrays may
become really huge due to lots of dead css holding the id.
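
To make the concern concrete, here is a rough user-space sketch of an id pool. It is an assumption-laden model, not kernel code: id_pool, id_alloc, id_free and MODEL_MAX_IDS are hypothetical names, and array_size merely stands in for the number of slots an array indexed by kmemcg_id would need.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_IDS 64	/* arbitrary bound for the sketch */

/* Hypothetical user-space model of an id allocator; not kernel code. */
struct id_pool {
	bool used[MODEL_MAX_IDS];
	int array_size;	/* slots an array indexed by id would need */
};

/* Hand out the lowest free id, or -1 if the pool is exhausted. */
static int id_alloc(struct id_pool *p)
{
	for (int id = 0; id < MODEL_MAX_IDS; id++) {
		if (!p->used[id]) {
			p->used[id] = true;
			if (id + 1 > p->array_size)
				p->array_size = id + 1;
			return id;
		}
	}
	return -1;
}

/* Release an id so that a later cgroup can reuse it. */
static void id_free(struct id_pool *p, int id)
{
	assert(id >= 0 && id < MODEL_MAX_IDS && p->used[id]);
	p->used[id] = false;
}
```

In this model, creating and deleting cgroups one after another without ever calling id_free makes array_size grow with the total number of cgroups ever created, whereas releasing the id on offline keeps it bounded by the number of cgroups alive at once.
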
Thanks.