From: Tejun Heo <tj@kernel.org>
To: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
Pekka Enberg <penberg@kernel.org>,
Christoph Lameter <cl@linux-foundation.org>,
Li Zefan <lizefan@huawei.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org
Subject: Re: [patch 7/8] mm, memcg: allow processes handling oom notifications to access reserves
Date: Wed, 11 Dec 2013 07:42:40 -0500
Message-ID: <20131211124240.GA24557@htj.dyndns.org>
In-Reply-To: <alpine.DEB.2.02.1312101522400.22701@chino.kir.corp.google.com>
Yo,
On Tue, Dec 10, 2013 at 03:55:48PM -0800, David Rientjes wrote:
> > Well, the gotcha there is that you won't be able to do that with
> > system level OOM handler either unless you create a separately
> > reserved memory, which, again, can be achieved using hierarchical
> > memcg setup already. Am I missing something here?
>
> System oom conditions would only arise when the usage of memcgs A + B
> above cause the page allocator to not be able to allocate memory without
> oom killing something even though the limits of both A and B may not have
> been reached yet. No userspace oom handler can allocate memory with
> access to memory reserves in the page allocator in such a context; it's
> vital that if we are to handle system oom conditions in userspace that we
> give them access to memory that other processes can't allocate. You
> could attach a userspace system oom handler to any memcg in this scenario
> with memory.oom_reserve_in_bytes and since it has PF_OOM_HANDLER it would
> be able to allocate in reserves in the page allocator and overcharge in
> its memcg to handle it. This isn't possible only with a hierarchical
> memcg setup unless you ensure the sum of the limits of the top level
> memcgs do not equal or exceed the sum of the min watermarks of all memory
> zones, and we exceed that.
Yes, exactly. If system memory is 128M, create top level memcgs w/
120M and 8M each (well, with some slack of course) and then overcommit
the descendants of 120M while putting OOM handlers and friends under
8M without overcommitting.
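
To make the layout above concrete, here is a minimal sketch in Python that models the two top-level memcgs and checks the properties described: the top-level limits fit in system memory, the 120M subtree is overcommitted, and the 8M subtree is not. The Memcg class, the group names (workloads, job_a, job_b, reserved, oom_handler), the child limits, and the /sys/fs/cgroup/memory mount point are all assumptions made for illustration; only memory.limit_in_bytes is the actual cgroup v1 file name. The slack mentioned above is left out for simplicity, and nothing is written to the filesystem.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

MiB = 1024 * 1024


@dataclass
class Memcg:
    """Hypothetical model of a memcg: a name, a hard limit, and children."""
    name: str
    limit: int  # bytes; would be written to memory.limit_in_bytes
    children: List["Memcg"] = field(default_factory=list)

    def overcommitted(self) -> bool:
        """True if the children's limits add up to more than this limit."""
        return sum(child.limit for child in self.children) > self.limit


def limit_writes(node: Memcg, parent_path: str) -> Iterator[Tuple[str, int]]:
    """Yield (file, value) pairs for the limits, parents before children."""
    path = f"{parent_path}/{node.name}"
    yield f"{path}/memory.limit_in_bytes", node.limit
    for child in node.children:
        yield from limit_writes(child, path)


# Assumed 128M machine, split 120M + 8M at the top level (slack omitted).
system_memory = 128 * MiB

# Descendants of the 120M group are overcommitted: 100M + 100M > 120M.
workloads = Memcg("workloads", 120 * MiB, [
    Memcg("job_a", 100 * MiB),
    Memcg("job_b", 100 * MiB),
])

# The 8M group holds the OOM handler and friends and is not overcommitted.
reserved = Memcg("reserved", 8 * MiB, [
    Memcg("oom_handler", 8 * MiB),
])

top_level = [workloads, reserved]

# The top-level limits must not add up to more than system memory.
fits = sum(group.limit for group in top_level) <= system_memory
print("top level fits in system memory:", fits)
print("workloads overcommitted:", workloads.overcommitted())
print("reserved overcommitted:", reserved.overcommitted())

# Show the writes that would set this up, without touching the filesystem.
for group in top_level:
    for path, value in limit_writes(group, "/sys/fs/cgroup/memory"):
        print(f"{value} > {path}")
```

The point of the split is that the handler's 8M is guaranteed by the hierarchy itself, so the handler does not depend on the page allocator's reserves even when everything under the 120M group is under pressure.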
...
> The stronger rationale is that you can't handle system oom in userspace
> without this functionality and we need to do so.
You're giving yourself an unreasonable precondition - overcommitting
at root level and handling system OOM from userland - and then trying
to contort everything to fit that. How can "overcommitting at root
level" possibly be a goal in and of itself? Please take a step back
and look at and explain the *problem* you're trying to solve. You
haven't explained why that *need*s to be the case at all.
I wrote this at the start of the thread but you're still doing the
same thing. You're trying to create a hidden memcg level inside a
memcg. At the beginning of this thread, you were trying to do that
for !root memcgs and now you're arguing that you *need* that for root
memcg. Because there's no other limit we can make use of, you're
suggesting the use of kernel reserve memory for that purpose. It
seems like an absurd thing to do to me. It could be that you might
not be able to achieve exactly the same thing that way, but the right
thing to do would be improving memcg in general so that it can, instead
of adding yet another layer of half-baked complexity, right?
Even if there are some inherent advantages of system userland OOM
handling with a separate physical memory reserve, which AFAICS you
haven't succeeded at showing yet, this is a very invasive change and,
as you said before, something with an *extremely* narrow use case.
Wouldn't it be a better idea to improve the existing mechanisms - be
that memcg in general or kernel OOM handling - to fit the niche use
case better? I mean, just think about all the corner cases. How are
you gonna handle priority inversion through locked pages or
allocations given out to other tasks through slab? You're suggesting
opening a giant can of worms for an extremely narrow benefit, and it
doesn't even seem like that can actually needs opening.
Thanks.
--
tejun