From: Vladimir Davydov <vdavydov@parallels.com>
To: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>, Michal Hocko <mhocko@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Pekka Enberg <penberg@kernel.org>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled
Date: Tue, 1 Sep 2015 12:25:20 +0300
Message-ID: <20150901092520.GA21226@esperanza>
In-Reply-To: <alpine.DEB.2.11.1508311521040.30405@east.gentwo.org>
On Mon, Aug 31, 2015 at 03:22:22PM -0500, Christoph Lameter wrote:
> On Mon, 31 Aug 2015, Vladimir Davydov wrote:
>
> > I totally agree that we should strive to make a kmem user feel roughly
> > the same in memcg as if it were running on a host with equal amount of
> > RAM. There are two ways to achieve that:
> >
> > 1. Make the API functions, i.e. kmalloc and friends, behave inside
> > memcg roughly the same way as they do in the root cgroup.
> > 2. Make the internal memcg functions, i.e. try_charge and friends,
> > behave roughly the same way as alloc_pages.
> >
> > I find way 1 more flexible, because we don't have to blindly follow
> > heuristics used on global memory reclaim and therefore have more
> > opportunities to achieve the same goal.
>
> The heuristics need to integrate well if its in a cgroup or not. In
> general make use of cgroups as transparent as possible to the rest of the
> code.

Half of the kmem accounting implementation resides in SLAB/SLUB. We can't
just make the use of cgroups transparent there. For the rest of the code
using kmalloc, cgroups are transparent.

Indeed, we can make memcg_charge_slab behave exactly like alloc_pages,
we can even put it into alloc_pages (where it used to be), but why, if the
only user of memcg_charge_slab is the SLAB/SLUB core?

I think we'd have more room to manoeuvre if we just taught SLAB/SLUB to
use memcg_charge_slab wisely (as it used to until recently), because
memcg charge/reclaim is quite different from global alloc/reclaim:

 - it isn't aware of NUMA nodes, so trying to charge w/o __GFP_WAIT
   while inspecting nodes, as SLAB does, is meaningless

 - it isn't aware of high order page allocations, so trying to charge
   w/o __GFP_WAIT while optimistically trying to get a high order page,
   as SLUB does, is meaningless too

 - it can always let a high prio allocation go unaccounted, so IMO there
   is no point in introducing emergency reserves (__GFP_MEMALLOC
   handling)

 - it can always charge a GFP_NOWAIT allocation even if it exceeds the
   limit, and issue direct reclaim when a GFP_KERNEL allocation comes or
   from a task work, because there is no risk of depleting memory
   reserves; so it isn't obvious to me whether we really need an async
   thread handling memcg reclaim like kswapd
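
To illustrate the last point, here is a toy userspace model of "charge
now, reclaim later". It is only a sketch, not kernel code: toy_memcg,
toy_charge and toy_reclaim are names made up for this example, can_block
stands in for "the gfp mask allows direct reclaim" (GFP_KERNEL as opposed
to GFP_NOWAIT), and reclaim is assumed to always succeed in bringing
usage back to the limit, which real reclaim cannot guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a memcg page counter; not a kernel structure. */
struct toy_memcg {
	size_t usage;		/* pages currently charged */
	size_t limit;		/* threshold in pages */
	bool reclaim_pending;	/* a non-blocking charge overshot the limit */
};

/*
 * Pretend reclaim: assume it always frees exactly the excess.
 * Returns the number of pages "reclaimed".
 */
size_t toy_reclaim(struct toy_memcg *cg)
{
	size_t excess = cg->usage > cg->limit ? cg->usage - cg->limit : 0;

	cg->usage -= excess;
	cg->reclaim_pending = false;
	return excess;
}

/*
 * Charge nr_pages unconditionally. If the charge pushes usage over the
 * limit, a caller that may block reclaims right away; a caller that may
 * not block only records that reclaim is owed and leaves the work to a
 * later blocking charge.
 */
void toy_charge(struct toy_memcg *cg, size_t nr_pages, bool can_block)
{
	assert(nr_pages > 0);

	cg->usage += nr_pages;
	if (cg->usage <= cg->limit)
		return;

	if (can_block)
		toy_reclaim(cg);
	else
		cg->reclaim_pending = true;
}
```

With a limit of 10 pages, charging 8 pages and then 5 more with
can_block == false leaves usage at 13 with reclaim pending; the next
charge with can_block == true brings usage back down to 10. The
overshoot is temporary and does not touch any global reserve, which is
why the model gets by without a background reclaim thread.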
Thanks,
Vladimir