From: Roman Gushchin <guro@fb.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: Linux MM <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>,
kernel-team@fb.com, Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
luto@kernel.org, Konstantin Khlebnikov <koct9i@gmail.com>,
Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH v2 1/3] mm: rework memcg kernel stack accounting
Date: Wed, 29 Aug 2018 14:24:25 -0700 [thread overview]
Message-ID: <20180829212422.GA13097@castle> (raw)
In-Reply-To: <CALvZod4HAf+iPXQx1v+dwJkTph3ySAiYo4kn4d2jRFNQS59Tgg@mail.gmail.com>

On Tue, Aug 21, 2018 at 03:10:52PM -0700, Shakeel Butt wrote:
> On Tue, Aug 21, 2018 at 2:36 PM Roman Gushchin <guro@fb.com> wrote:
> >
> > If CONFIG_VMAP_STACK is set, kernel stacks are allocated
> > using __vmalloc_node_range() with __GFP_ACCOUNT, so kernel
> > stack pages are charged against the corresponding memory
> > cgroup on allocation and uncharged when they are released.
> >
> > The problem is that we cache kernel stacks in small
> > per-cpu caches and reuse them for new tasks, which can
> > belong to different memory cgroups.
> >
> > Each stack page still holds a reference to the original cgroup,
> > so the cgroup can't be released until the vmap area is released.
> >
> > To make this happen, we need more than two subsequent exits
> > without forks in between on the current CPU, which makes it
> > very unlikely in practice. As a result, I saw a significant
> > number of dying cgroups (in theory, up to 2 * number_of_cpus +
> > number_of_tasks), which can't be released even under
> > significant memory pressure.
> >
> > As a cgroup structure can take a significant amount of memory
> > (first of all, per-cpu data such as memcg statistics), this
> > leads to a noticeable waste of memory.
> >
> > Signed-off-by: Roman Gushchin <guro@fb.com>
>
> Reviewed-by: Shakeel Butt <shakeelb@google.com>
>
> BTW this makes a very good use-case for optimizing kmem uncharging
> similar to what you did for skmem uncharging.

The only thing I'm slightly worried about here is that it can make
reclaiming of memory cgroups harder. It's probably still OK, but let
me first finish the work I'm doing on optimizing the whole memcg
reclaim process, and then return to this case.

Thanks!
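
For context, here is a simplified sketch of the allocation path the
quoted commit message describes, abbreviated and lightly paraphrased
from kernel/fork.c of that era rather than copied verbatim: a stack is
either reused from the small per-cpu cache or freshly allocated via
__vmalloc_node_range() with THREADINFO_GFP, which includes
__GFP_ACCOUNT.

/*
 * Simplified sketch, CONFIG_VMAP_STACK=y (error handling, zeroing of
 * reused stacks and the non-vmap fallback are omitted).
 */
static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
{
        void *stack;
        int i;

        /* First try to reuse a stack freed recently on this CPU. */
        for (i = 0; i < NR_CACHED_STACKS; i++) {
                struct vm_struct *s = this_cpu_xchg(cached_stacks[i], NULL);

                if (!s)
                        continue;

                /* The backing pages are still charged to whatever memcg
                 * originally allocated this stack. */
                tsk->stack_vm_area = s;
                return s->addr;
        }

        /*
         * THREADINFO_GFP includes __GFP_ACCOUNT, so a freshly vmalloc'ed
         * stack is charged to the memcg of the forking task.
         */
        stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
                                     VMALLOC_START, VMALLOC_END,
                                     THREADINFO_GFP,
                                     PAGE_KERNEL,
                                     0, node, __builtin_return_address(0));
        if (stack)
                tsk->stack_vm_area = find_vm_area(stack);
        return stack;
}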
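
And a rough sketch of the direction such a rework can take. The helper
name and its exact placement below are illustrative rather than taken
from the patch; memcg_kmem_charge()/memcg_kmem_uncharge() are the
per-page charging interfaces of that era. The idea is to allocate the
stack without __GFP_ACCOUNT and charge the backing pages to the memcg
of the task that is about to use the stack, whether the stack is fresh
or reused from the per-cpu cache:

/*
 * Illustrative only: charge the vmalloc-backed stack pages to the
 * memcg of the task that is about to use them (error unwinding
 * omitted).
 */
static int charge_kernel_stack_pages(struct task_struct *tsk)
{
        struct vm_struct *vm = tsk->stack_vm_area;
        int i, ret;

        if (!vm)
                return 0;

        for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) {
                /* Charge one order-0 page to the current task's memcg. */
                ret = memcg_kmem_charge(vm->pages[i], GFP_KERNEL, 0);
                if (ret)
                        return ret;
        }

        return 0;
}

On stack release, each page is uncharged again (memcg_kmem_uncharge()
per page), so a stack sitting in the per-cpu cache no longer pins the
cgroup it was first allocated for.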

Thread overview: 12+ messages
2018-08-21 21:35 [PATCH v2 1/3] mm: rework memcg kernel stack accounting Roman Gushchin
2018-08-21 21:35 ` [PATCH v2 2/3] mm: drain memcg stocks on css offlining Roman Gushchin
2018-08-21 21:35 ` [PATCH v2 3/3] mm: don't miss the last page because of round-off error Roman Gushchin
2018-08-21 22:10 ` [PATCH v2 1/3] mm: rework memcg kernel stack accounting Shakeel Butt
2018-08-21 22:15 ` Roman Gushchin
2018-08-29 21:24 ` Roman Gushchin [this message]
2018-08-29 21:30 ` Shakeel Butt
2018-08-22 14:12 ` Michal Hocko
2018-08-23 16:23 ` Roman Gushchin
2018-08-24 7:52 ` Michal Hocko
2018-08-24 12:50 ` Johannes Weiner
2018-08-24 15:42 ` Roman Gushchin