From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-qt0-f197.google.com (mail-qt0-f197.google.com [209.85.216.197])
	by kanga.kvack.org (Postfix) with ESMTP id DB94F6B0010
	for ; Wed, 1 Aug 2018 13:48:06 -0400 (EDT)
Received: by mail-qt0-f197.google.com with SMTP id l23-v6so16478390qtp.1
	for ; Wed, 01 Aug 2018 10:48:06 -0700 (PDT)
Received: from mail-sor-f65.google.com (mail-sor-f65.google.com. [209.85.220.65])
	by mx.google.com with SMTPS id 39-v6sor8411520qtm.155.2018.08.01.10.48.03
	for (Google Transport Security);
	Wed, 01 Aug 2018 10:48:03 -0700 (PDT)
Date: Wed, 1 Aug 2018 13:50:57 -0400
From: Johannes Weiner
Subject: Re: [PATCH 3/3] mm, oom: introduce memory.oom.group
Message-ID: <20180801175057.GD11386@cmpxchg.org>
References: <20180730180100.25079-1-guro@fb.com> <20180730180100.25079-4-guro@fb.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180730180100.25079-4-guro@fb.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Roman Gushchin
Cc: linux-mm@kvack.org, Michal Hocko, David Rientjes, Tetsuo Handa, Tejun Heo, kernel-team@fb.com, linux-kernel@vger.kernel.org

On Mon, Jul 30, 2018 at 11:01:00AM -0700, Roman Gushchin wrote:
> For some workloads an intervention from the OOM killer
> can be painful. Killing a random task can bring
> the workload into an inconsistent state.
>
> Historically, there are two common solutions for this
> problem:
> 1) enabling panic_on_oom,
> 2) using a userspace daemon to monitor OOMs and kill
>    all outstanding processes.
>
> Both approaches have their downsides:
> rebooting on each OOM is an obvious waste of capacity,
> and handling all in userspace is tricky and requires
> a userspace agent, which will monitor all cgroups
> for OOMs.
>
> In most cases an in-kernel after-OOM cleaning-up
> mechanism can eliminate the necessity of enabling
> panic_on_oom. Also, it can simplify the cgroup
> management for userspace applications.
>
> This commit introduces a new knob for cgroup v2 memory
> controller: memory.oom.group. The knob determines
> whether the cgroup should be treated as a single
> unit by the OOM killer. If set, the cgroup and its
> descendants are killed together or not at all.
>
> To determine which cgroup has to be killed, we
> traverse the cgroup hierarchy from the victim task's
> cgroup up to the OOMing cgroup (or root) and look
> for the highest-level cgroup with memory.oom.group set.
>
> Tasks with OOM protection (oom_score_adj set to -1000)
> are treated as an exception and are never killed.
>
> This patch doesn't change the OOM victim selection algorithm.
>
> Signed-off-by: Roman Gushchin
> Cc: Michal Hocko
> Cc: Johannes Weiner
> Cc: David Rientjes
> Cc: Tetsuo Handa
> Cc: Tejun Heo

The semantics make sense to me and the code is straightforward. With
Michal's other feedback incorporated, please feel free to add:

Acked-by: Johannes Weiner
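
For readers following the changelog, the lookup it describes (walk from
the victim task's cgroup up to the OOMing cgroup and keep the
highest-level cgroup with memory.oom.group set) can be sketched in
userspace C. This is only an illustration, not the code from the patch:
struct cg, its fields, and find_oom_group() are hypothetical stand-ins
for the kernel's memory cgroup structures, and the sketch assumes the
walk includes the OOMing cgroup itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup; not the kernel's struct. */
struct cg {
    struct cg *parent;  /* NULL for the root cgroup */
    bool oom_group;     /* models the memory.oom.group knob */
};

/*
 * Walk from the victim's cgroup towards the root, stopping after the
 * OOMing cgroup (oom_domain). Each match overwrites the previous one,
 * so the result is the highest-level cgroup with oom_group set, or
 * NULL if no cgroup on the path has it set.
 */
static struct cg *find_oom_group(struct cg *victim, struct cg *oom_domain)
{
    struct cg *group = NULL;

    for (struct cg *c = victim; c; c = c->parent) {
        if (c->oom_group)
            group = c;
        if (c == oom_domain)
            break;  /* do not look above the OOMing cgroup */
    }
    return group;
}
```

A NULL result corresponds to the unchanged behaviour where only the
selected victim task is killed. A non-NULL result names the cgroup
whose tasks would be killed together, with tasks whose oom_score_adj
is -1000 skipped as the changelog states.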