From: Roman Gushchin <guro@fb.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
kernel-team@fb.com, linux-mm@kvack.org,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
Andrew Morton <akpm@linux-foundation.org>,
cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [v8 0/4] cgroup-aware OOM killer
Date: Wed, 27 Sep 2017 10:57:56 +0100 [thread overview]
Message-ID: <20170927095756.GA4159@castle> (raw)
In-Reply-To: <20170927073744.5g7dq5c5spmtgz5g@dhcp22.suse.cz>
On Wed, Sep 27, 2017 at 09:37:44AM +0200, Michal Hocko wrote:
> On Tue 26-09-17 14:04:41, David Rientjes wrote:
> > On Tue, 26 Sep 2017, Michal Hocko wrote:
> >
> > > > No, I agree that we shouldn't compare sibling memory cgroups based on
> > > > different criteria depending on whether group_oom is set or not.
> > > >
> > > > I think it would be better to compare siblings based on the same criteria
> > > > independent of group_oom if the user has mounted the hierarchy with the
> > > > new mode (I think we all agree that the mount option is needed). It's
> > > > very easy to describe to the user and the selection is simple to
> > > > understand.
> > >
> > > I disagree. Just take the most simplistic example when cgroups reflect
> > > some other higher level organization - e.g. school with teachers,
> > > students and admins as the top level cgroups to control the proper cpu
> > > share load. Now you want to have a fair OOM selection between different
> > > entities. Do you consider selecting students all the time as an expected
> > > behavior just because their are the largest group? This just doesn't
> > > make any sense to me.
> > >
> >
> > Are you referring to this?
> >
> > root
> > / \
> > students admins
> > / \ / \
> > A B C D
> >
> > If the cumulative usage of all students exceeds the cumulative usage of
> > all admins, yes, the choice is to kill from the /students tree.
>
> Which is wrong IMHO because the number of stutends is likely much more
> larger than admins (or teachers) yet it might be the admins one to run
> away. This example simply shows how comparing siblinks highly depends
> on the way you organize the hierarchy rather than the actual memory
> consumer runaways which is the primary goal of the OOM killer to handle.
>
> > This has been Roman's design from the very beginning.
>
> I suspect this was the case because deeper hierarchies for
> organizational purposes haven't been considered.
>
> > If the preference is to kill
> > the single largest process, which may be attached to either subtree, you
> > would not have opted-in to the new heuristic.
>
> I believe you are making a wrong assumption here. The container cleanup
> is sound reason to opt in and deeper hierarchies are simply required in
> the cgroup v2 world where you do not have separate hierarchies.
>
> > > > Then, once a cgroup has been chosen as the victim cgroup,
> > > > kill the process with the highest badness, allowing the user to influence
> > > > that with /proc/pid/oom_score_adj just as today, if group_oom is disabled;
> > > > otherwise, kill all eligible processes if enabled.
> > >
> > > And now, what should be the semantic of group_oom on an intermediate
> > > (non-leaf) memcg? Why should we compare it to other killable entities?
> > > Roman was mentioning a setup where a _single_ workload consists of a
> > > deeper hierarchy which has to be shut down at once. It absolutely makes
> > > sense to consider the cumulative memory of that hierarchy when we are
> > > going to kill it all.
> > >
> >
> > If group_oom is enabled on an intermediate memcg, I think the intuitive
> > way to handle it would be that all descendants are also implicitly or
> > explicitly group_oom.
>
> This is an interesting point. I would tend to agree here. If somebody
> requires all-in clean up up the hierarchy it feels strange that a
> subtree would disagree (e.g. during memcg oom on the subtree). I can
> hardly see a usecase that would really need a different group_oom policy
> depending on where in the hierarchy the oom happened to be honest.
> Roman?
Yes, I'd say that it's strange to apply settings from outside the OOMing
cgroup to the subtree, but actually it's not. The oom_group setting should
basically mean that the OOM killer will not kill a random task in the subtree.
And it doesn't matter if it was global or memcg-wide OOM.
Applied to v9. Thanks!
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-09-27 9:58 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-09-11 13:17 [v8 0/4] cgroup-aware OOM killer Roman Gushchin
2017-09-11 13:17 ` [v8 1/4] mm, oom: refactor the oom_kill_process() function Roman Gushchin
2017-09-11 20:51 ` David Rientjes
2017-09-14 13:42 ` Michal Hocko
2017-09-11 13:17 ` [v8 2/4] mm, oom: cgroup-aware OOM killer Roman Gushchin
2017-09-13 20:46 ` David Rientjes
2017-09-13 21:59 ` Roman Gushchin
2017-09-11 13:17 ` [v8 3/4] mm, oom: add cgroup v2 mount option for " Roman Gushchin
2017-09-11 20:48 ` David Rientjes
2017-09-12 20:01 ` Roman Gushchin
2017-09-12 20:23 ` David Rientjes
2017-09-13 12:23 ` Michal Hocko
2017-09-11 13:17 ` [v8 4/4] mm, oom, docs: describe the " Roman Gushchin
2017-09-11 20:44 ` [v8 0/4] " David Rientjes
2017-09-13 12:29 ` Michal Hocko
2017-09-13 20:46 ` David Rientjes
2017-09-14 13:34 ` Michal Hocko
2017-09-14 20:07 ` David Rientjes
2017-09-13 21:56 ` Roman Gushchin
2017-09-14 13:40 ` Michal Hocko
2017-09-14 16:05 ` Roman Gushchin
2017-09-15 10:58 ` Michal Hocko
2017-09-15 15:23 ` Roman Gushchin
2017-09-15 19:55 ` David Rientjes
2017-09-15 21:08 ` Roman Gushchin
2017-09-18 6:20 ` Michal Hocko
2017-09-18 15:02 ` Roman Gushchin
2017-09-21 8:30 ` David Rientjes
2017-09-19 20:54 ` David Rientjes
2017-09-20 22:24 ` Roman Gushchin
2017-09-21 8:27 ` David Rientjes
2017-09-18 6:16 ` Michal Hocko
2017-09-19 20:51 ` David Rientjes
2017-09-18 6:14 ` Michal Hocko
2017-09-20 21:53 ` Roman Gushchin
2017-09-25 12:24 ` Michal Hocko
2017-09-25 17:00 ` Johannes Weiner
2017-09-25 18:15 ` Roman Gushchin
2017-09-25 20:25 ` Michal Hocko
2017-09-26 10:59 ` Roman Gushchin
2017-09-26 11:21 ` Michal Hocko
2017-09-26 12:13 ` Roman Gushchin
2017-09-26 13:30 ` Michal Hocko
2017-09-26 17:26 ` Johannes Weiner
2017-09-27 3:37 ` Tim Hockin
2017-09-27 7:43 ` Michal Hocko
2017-09-27 10:19 ` Roman Gushchin
2017-09-27 15:35 ` Tim Hockin
2017-09-27 16:23 ` Roman Gushchin
2017-09-27 18:11 ` Tim Hockin
2017-10-01 23:29 ` Shakeel Butt
2017-10-02 11:56 ` Tetsuo Handa
2017-10-02 12:24 ` Michal Hocko
2017-10-02 12:47 ` Roman Gushchin
2017-10-02 14:29 ` Michal Hocko
2017-10-02 19:00 ` Shakeel Butt
2017-10-02 19:28 ` Michal Hocko
2017-10-02 19:45 ` Shakeel Butt
2017-10-02 19:56 ` Michal Hocko
2017-10-02 20:00 ` Tim Hockin
2017-10-02 20:08 ` Michal Hocko
2017-10-02 20:09 ` Shakeel Butt
2017-10-02 20:20 ` Shakeel Butt
2017-10-02 20:24 ` Shakeel Butt
2017-10-02 20:34 ` Johannes Weiner
2017-10-02 20:55 ` Michal Hocko
2017-09-25 22:21 ` David Rientjes
2017-09-26 8:46 ` Michal Hocko
2017-09-26 21:04 ` David Rientjes
2017-09-27 7:37 ` Michal Hocko
2017-09-27 9:57 ` Roman Gushchin [this message]
2017-09-21 14:21 ` Johannes Weiner
2017-09-21 21:17 ` David Rientjes
2017-09-21 21:51 ` Johannes Weiner
2017-09-22 20:53 ` David Rientjes
2017-09-22 15:44 ` Tejun Heo
2017-09-22 20:39 ` David Rientjes
2017-09-22 21:05 ` Tejun Heo
2017-09-23 8:16 ` David Rientjes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170927095756.GA4159@castle \
--to=guro@fb.com \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@fb.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=rientjes@google.com \
--cc=tj@kernel.org \
--cc=vdavydov.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).