From: Roman Gushchin <guro@fb.com>
To: David Rientjes <rientjes@google.com>
Cc: linux-mm@kvack.org, Michal Hocko <mhocko@kernel.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
Tejun Heo <tj@kernel.org>,
kernel-team@fb.com, cgroups@vger.kernel.org,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [v4 4/4] mm, oom, docs: describe the cgroup-aware OOM killer
Date: Mon, 14 Aug 2017 13:28:32 +0100 [thread overview]
Message-ID: <20170814122832.GB24393@castle.DHCP.thefacebook.com> (raw)
In-Reply-To: <alpine.DEB.2.10.1708081615110.54505@chino.kir.corp.google.com>
On Tue, Aug 08, 2017 at 04:24:32PM -0700, David Rientjes wrote:
> On Wed, 26 Jul 2017, Roman Gushchin wrote:
>
> > +Cgroup-aware OOM Killer
> > +~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Cgroup v2 memory controller implements a cgroup-aware OOM killer.
> > +It means that it treats memory cgroups as first class OOM entities.
> > +
> > +Under OOM conditions the memory controller tries to make the best
> > +choise of a victim, hierarchically looking for the largest memory
> > +consumer. By default, it will look for the biggest task in the
> > +biggest leaf cgroup.
> > +
> > +Be default, all cgroups have oom_priority 0, and OOM killer will
> > +chose the largest cgroup recursively on each level. For non-root
> > +cgroups it's possible to change the oom_priority, and it will cause
> > +the OOM killer to look athe the priority value first, and compare
> > +sizes only of cgroups with equal priority.
> > +
> > +But a user can change this behavior by enabling the per-cgroup
> > +oom_kill_all_tasks option. If set, it causes the OOM killer treat
> > +the whole cgroup as an indivisible memory consumer. In case if it's
> > +selected as on OOM victim, all belonging tasks will be killed.
> > +
> > +Tasks in the root cgroup are treated as independent memory consumers,
> > +and are compared with other memory consumers (e.g. leaf cgroups).
> > +The root cgroup doesn't support the oom_kill_all_tasks feature.
> > +
> > +This affects both system- and cgroup-wide OOMs. For a cgroup-wide OOM
> > +the memory controller considers only cgroups belonging to the sub-tree
> > +of the OOM'ing cgroup.
> > +
> > IO
> > --
>
> Thanks very much for following through with this.
>
> As described in http://marc.info/?l=linux-kernel&m=149980660611610 this is
> very similar to what we do for priority based oom killing.
>
> I'm wondering your comments on extending it one step further, however:
> include process priority as part of the selection rather than simply memcg
> priority.
>
> memory.oom_priority will dictate which memcg the kill will originate from,
> but processes have no ability to specify that they should actually be
> killed as opposed to a leaf memcg. I'm not sure how important this is for
> your usecase, but we have found it useful to be able to specify process
> priority as part of the decisionmaking.
>
> At each level of consideration, we simply kill a process with lower
> /proc/pid/oom_priority if there are no memcgs with a lower
> memory.oom_priority. This allows us to define the exact process that will
> be oom killed, absent oom_kill_all_tasks, and not require that the process
> be attached to leaf memcg. Most notably these are processes that are best
> effort: stats collection, logging, etc.
I'm focused on cgroup v2 interface, that means, that there are no processes
belonging to non-leaf cgroups. So, cgroups are compared only with root-cgroup
processes, and I'm not sure we really need a way to prioritize them.
>
> Do you think it would be helpful to introduce per-process oom priority as
> well?
I'm not against per-process oom_priority, and it might be a good idea
to replace the existing oom_score_adj with it at some point. I might be wrong,
but I think users mostly using the extereme oom_score_adj values;
no one really needs the tiebreaking based on some percentages
of the total memory. And oom_priority will be just a simpler and more clear
way to express the same intention.
But it's not directly related to this patchset, and it's more arguable,
so I think it can be done later.
prev parent reply other threads:[~2017-08-14 12:28 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20170726132718.14806-1-guro@fb.com>
2017-07-26 13:27 ` [v4 1/4] mm, oom: refactor the TIF_MEMDIE usage Roman Gushchin
2017-07-26 13:56 ` Michal Hocko
2017-07-26 14:06 ` Roman Gushchin
2017-07-26 14:24 ` Michal Hocko
2017-07-26 14:44 ` Michal Hocko
[not found] ` <20170726144408.GU2981-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2017-07-26 14:50 ` Roman Gushchin
2017-07-26 13:27 ` [v4 2/4] mm, oom: cgroup-aware OOM killer Roman Gushchin
[not found] ` <20170726132718.14806-3-guro-b10kYP2dOMg@public.gmane.org>
2017-07-27 21:41 ` kbuild test robot
2017-08-01 14:54 ` Michal Hocko
2017-08-01 15:25 ` Roman Gushchin
2017-08-01 17:03 ` Michal Hocko
2017-08-01 18:13 ` Roman Gushchin
2017-08-02 7:29 ` Michal Hocko
2017-08-03 12:47 ` Roman Gushchin
[not found] ` <20170803124751.GA24563-2xczL/1GIl5a1dPMsufgnw2O0Ztt9esIQQ4Iyu8u01E@public.gmane.org>
2017-08-03 13:01 ` Michal Hocko
2017-08-08 23:06 ` David Rientjes
[not found] ` <alpine.DEB.2.10.1708081559001.54505-X6Q0R45D7oAcqpCFd4KODRPsWskHk0ljAL8bYrjMMd8@public.gmane.org>
2017-08-14 12:03 ` Roman Gushchin
2017-07-26 13:27 ` [v4 3/4] mm, oom: introduce oom_priority for memory cgroups Roman Gushchin
[not found] ` <20170726132718.14806-4-guro-b10kYP2dOMg@public.gmane.org>
2017-08-08 23:14 ` David Rientjes
2017-08-14 12:39 ` Roman Gushchin
2017-07-26 13:27 ` [v4 4/4] mm, oom, docs: describe the cgroup-aware OOM killer Roman Gushchin
[not found] ` <20170726132718.14806-5-guro-b10kYP2dOMg@public.gmane.org>
2017-08-08 23:24 ` David Rientjes
2017-08-14 12:28 ` Roman Gushchin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170814122832.GB24393@castle.DHCP.thefacebook.com \
--to=guro@fb.com \
--cc=cgroups@vger.kernel.org \
--cc=hannes@cmpxchg.org \
--cc=kernel-team@fb.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=penguin-kernel@i-love.sakura.ne.jp \
--cc=rientjes@google.com \
--cc=tj@kernel.org \
--cc=vdavydov.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox