From: Roman Gushchin <guro@fb.com>
To: David Rientjes <rientjes@google.com>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@kernel.org,
	hannes@cmpxchg.org, tj@kernel.org, gthelen@google.com
Subject: Re: cgroup-aware OOM killer, how to move forward
Date: Fri, 13 Jul 2018 15:16:07 -0700
Message-ID: <20180713221602.GA15005@castle.DHCP.thefacebook.com>
In-Reply-To: <alpine.DEB.2.21.1807131423230.194789@chino.kir.corp.google.com>

On Fri, Jul 13, 2018 at 02:34:49PM -0700, David Rientjes wrote:
> On Wed, 11 Jul 2018, Roman Gushchin wrote:
> 
> > I was thinking about how to move forward with the cgroup-aware OOM killer.
> > It looks to me that we all agree on the "cleanup" part of the patchset:
> > it's a nice feature to be able to kill all tasks in the cgroup
> > to guarantee the consistent state of the workload.
> > All our disagreements are related to the victim selection algorithm.
> > 
> > So I wonder whether the right thing to do is to split the problem.
> > We can agree on the "cleanup" part, which is useful by itself,
> > merge it upstream, and then return to the victim selection
> > algorithm.
> > 
> > So, here is my proposal:
> > let's introduce the memory.group_oom knob with the following semantics:
> > if the knob is set, the OOM killer can kill either none or all of the
> > tasks in the cgroup*.
> > It works perfectly well with the current OOM killer (as a "cleanup" option),
> > and allows _any_ further approach to OOM victim selection.
> > It also doesn't require any mount/boot/tree-wide options.
> > 
> > How does it sound?
> > 
> 
> No objection, of course, this was always the mechanism vs policy 
> separation that I was referring to.  Having the ability to kill all 
> processes attached to the cgroup when one of its processes is selected is 
> useful, and we have our own patches that do just that, with the exception 
> that it's triggerable by the user.

Perfect! I'll prepare the patchset.

> 
> One of the things that I really like about cgroup v2, though, is what 
> appears to be an implicit, but rather apparent, goal to minimize the 
> number of files for each controller.  It's very clean.  So I'd suggest 
> that we consider memory.group_oom, or however it is named, to allow for 
> future development.
> 
> For example, rather than simply being binary, we'd probably want the 
> ability to kill all eligible processes attached directly to the victim's 
> mem cgroup *or* all processes attached to its subtree as well.
> 
> I'd suggest it be implemented to accept a string, "default"/"process", 
> "local" or "tree"/"hierarchy", or better names, to define the group oom 
> mechanism for the mem cgroup that is oom when one of its processes is 
> selected as a victim.

I would prefer to keep it boolean to match the simplicity of the cgroup v2 API.
In the v2 hierarchy, processes can't be attached to non-leaf cgroups,
so I don't see a place for a third meaning.

> 
> > * More precisely: if the OOM killer kills a task,
> > it will traverse the cgroup tree up to the OOM domain (OOMing memcg or root),
> > looking for the highest-level cgroup with group_oom set. Then it will
> > kill all tasks in such cgroup, if it does exist.
> > 
> 
> All such processes that are not oom disabled, yes.
> 

Yep, of course.
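
To make the footnoted semantics concrete, here is a rough userspace sketch
of the tree walk, not the actual patch. The names struct memcg, struct task,
find_oom_group() and task_killable() are hypothetical stand-ins, not kernel
APIs. The only value taken from the kernel is OOM_SCORE_ADJ_MIN (-1000),
which is how a task is marked as oom disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value the kernel uses to mark a task as oom disabled. */
#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical, simplified stand-in; not the kernel's struct mem_cgroup. */
struct memcg {
	struct memcg *parent;	/* NULL for the root cgroup */
	bool group_oom;		/* value of the proposed memory.group_oom knob */
};

/* Hypothetical, simplified stand-in for a task. */
struct task {
	int oom_score_adj;
};

/*
 * Walk from the victim's cgroup up to the OOM domain (inclusive) and
 * return the highest-level cgroup with group_oom set, or NULL if no
 * cgroup on that path has the knob set.
 */
static struct memcg *find_oom_group(struct memcg *victim,
				    struct memcg *oom_domain)
{
	struct memcg *found = NULL;

	assert(victim != NULL);

	for (struct memcg *cg = victim; cg != NULL; cg = cg->parent) {
		if (cg->group_oom)
			found = cg;	/* keep the highest match seen so far */
		if (cg == oom_domain)
			break;		/* never look above the OOM domain */
	}
	return found;
}

/* Tasks that are oom disabled are skipped when the group is killed. */
static bool task_killable(const struct task *t)
{
	return t->oom_score_adj != OOM_SCORE_ADJ_MIN;
}
```

If find_oom_group() returns NULL, only the originally selected task is
killed, as today. Otherwise every task in the returned cgroup for which
task_killable() holds is killed.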

Thanks!
