From: Tejun Heo <tj@kernel.org>
To: Vladimir Davydov <vdavydov@parallels.com>
Cc: Michal Hocko <mhocko@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Christoph Lameter <cl@linux.com>,
Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled
Date: Thu, 3 Sep 2015 12:32:43 -0400
Message-ID: <20150903163243.GD10394@mtj.duckdns.org>
In-Reply-To: <20150902093039.GA30160@esperanza>

Hello, Vladimir.

On Wed, Sep 02, 2015 at 12:30:39PM +0300, Vladimir Davydov wrote:
...
> To sum it up. Basically, there are two ways of handling kmemcg charges:
>
> 1. Make the memcg try_charge mimic alloc_pages behavior.
> 2. Make API functions (kmalloc, etc) work in memcg as if they were
> called from the root cgroup, while keeping interactions between the
> low level subsys (slab) and memcg private.
>
> Way 1 might look appealing at the first glance, but at the same time it
> is much more complex, because alloc_pages has grown over the years to
> handle a lot of subtle situations that may arise on global memory
> pressure, but impossible in memcg. What does way 1 give us then? We
> can't insert try_charge directly to alloc_pages and have to spread its
> calls all over the code anyway, so why is it better? Easier to use it in
> places where users depend on buddy allocator peculiarities? There are
> not many such users.

Maybe this is from inexperience but wouldn't 1 also be simpler than
the global case for the same reasons that doing 2 is simpler? It's
not like the fact that memory shortage inside memcg usually doesn't
mean global shortage goes away depending on whether we take 1 or 2.

That said, it is true that slab is an integral part of kmemcg and I
can't see how it can be made oblivious of memcg operations, so yeah
one way or the other slab has to know the details and we may have to
do some unusual things at that layer.

> I understand that the idea of way 1 is to provide a well-defined memcg
> API independent of the rest of the code, but that's just impossible. You
> need special casing anyway. E.g. you need those get/put_kmem_cache
> helpers, which exist solely for SLAB/SLUB. You need all this special
> stuff for growing per-memcg array in list_lru and kmem_cache, which
> exists solely for memcg-vs-list_lru and memcg-vs-slab interactions. We
> even handle kmem_cache destruction on memcg offline differently for SLAB
> and SLUB for performance reasons.

It isn't a black or white thing. Sure, slab should be involved in
kmemcg but at the same time if we can keep the amount of exposure in
check, that's the better way to go.

> Way 2 gives us more space to maneuver IMO. SLAB/SLUB may do weird tricks
> for optimization, but their API is well defined, so we just make kmalloc
> work as expected while providing inter-subsys calls, like
> memcg_charge_slab, for SLAB/SLUB that have their own conventions. You
> mentioned kmem users that allocate memory using alloc_pages. There is an
> API function for them too, alloc_kmem_pages. Everything behind the API
> is hidden and may be done in such a way to achieve optimal performance.

Ditto. Nobody is arguing that we can get it out completely but at the
same time handling of GFP_NOWAIT seems like a pretty fundamental
property that we'd wanna maintain at memcg boundary.

You said elsewhere that GFP_NOWAIT happening back-to-back is unlikely.
I'm not sure how much we can commit to that statement. GFP_KERNEL
allocating a huge amount of memory in a single go is a kernel bug.
GFP_NOWAIT optimization in a hot path which is accessible to userland
isn't and we'll be growing more and more of them. We need to be
protected against back-to-back GFP_NOWAIT allocations.

Thanks.

--
tejun