From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755949AbcEQQRE (ORCPT <rfc822;w@1wt.eu>);
	Tue, 17 May 2016 12:17:04 -0400
Received: from smtpoutz28.laposte.net ([194.117.213.103]:39719 "EHLO
	smtp.laposte.net" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org
	with ESMTP id S1751829AbcEQQRD (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 17 May 2016 12:17:03 -0400
Message-ID: <573B43FA.7080503@laposte.net>
Date: Tue, 17 May 2016 18:16:58 +0200
From: Sebastian Frias <sf84@laposte.net>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: Michal Hocko <mhocko@kernel.org>
CC: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>,
        Mason <slash.tmp@free.fr>, linux-mm@kvack.org,
        Andrew Morton <akpm@linux-foundation.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        LKML <linux-kernel@vger.kernel.org>, bsingharora@gmail.com
Subject: Re: [PATCH] mm: add config option to select the initial overcommit
 mode
References: <573593EE.6010502@free.fr> <20160513095230.GI20141@dhcp22.suse.cz> <5735AA0E.5060605@free.fr> <20160513114429.GJ20141@dhcp22.suse.cz> <5735C567.6030202@free.fr> <20160513140128.GQ20141@dhcp22.suse.cz> <20160513160410.10c6cea6@lxorguk.ukuu.org.uk> <5735F4B1.1010704@laposte.net> <20160513164357.5f565d3c@lxorguk.ukuu.org.uk> <573AD534.6050703@laposte.net> <20160517085724.GD14453@dhcp22.suse.cz>
In-Reply-To: <20160517085724.GD14453@dhcp22.suse.cz>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-VR-SrcIP: 83.142.147.193
X-VR-FullState: 0
X-VR-Score: -100
X-VR-AvState: No
X-VR-State: 0
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Michal,

On 05/17/2016 10:57 AM, Michal Hocko wrote:
> On Tue 17-05-16 10:24:20, Sebastian Frias wrote:
> [...]
>>>> Also, under what conditions would copy-on-write fail?
>>>
>>> When you have no memory or swap pages free and you touch a COW page that
>>> is currently shared. At that point there is no resource to back to the
>>> copy so something must die - either the process doing the copy or
>>> something else.
>>
>> Exactly, and why does "killing something else" makes more sense (or
>> was chosen over) "killing the process doing the copy"?
> 
> Because that "something else" is usually a memory hog and so chances are
> that the out of memory situation will get resolved. If you kill "process
> doing the copy" then you might end up just not getting any memory back
> because that might be a little forked process which doesn't own all that
> much memory on its own. That would leave you in the oom situation for a
> long time until somebody actually sitting on some memory happens to ask
> for CoW... See the difference?
> 

I see the difference. Your answer seems similar to the one from Austin, basically:
- killing a process is a kind of kernel self-protection that tries to deal "automatically" with certain situations, such as deciding what is a 'memory hog' or what is 'in an infinite loop', and "usually" gets it right.
It seems there are people who think it is better to avoid making such decisions in the kernel, and/or that they should be left to the user, because "usually" != "always".
And there are people who see that as a nice but complex thing to do.
In this thread we've tried to explain why this heuristic (and/or the OOM-killer) is/was needed, and its history, which has been very enlightening by the way.

Having read Documentation/cgroup-v1/memory.txt (and a few replies here that mention cgroups), it looks like the OOM-killer is still being actively discussed; there is also "cgroup-v2".
My understanding is that the cgroup memory controller will pause the processes in a given cgroup until the OOM situation is resolved for that cgroup, right?
If that is right, it means there is indeed a way to deal with an OOM situation (stack expansion, COW failure, 'memory hog', etc.) that is better than the OOM-killer, right?
In that case, do you know if there is a way to make the whole system behave as if it were inside a cgroup? (*)
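
For reference, here is a minimal sketch of the mechanism as I understand it from the "OOM Control" section of Documentation/cgroup-v1/memory.txt: writing "1" to a cgroup's memory.oom_control file disables the OOM-killer for that cgroup, and tasks in it then sleep when they hit the limit instead of being killed. The helper name and the example path (/sys/fs/cgroup/memory/mygroup) are assumptions for illustration, not something taken from a working setup.

```c
#include <stdio.h>

/*
 * Hypothetical helper (the name is illustrative only): ask the cgroup-v1
 * memory controller not to invoke the OOM-killer for one cgroup, by
 * writing "1" to that cgroup's memory.oom_control file.
 *
 * cgroup_dir is assumed to be the cgroup's directory, for example
 * /sys/fs/cgroup/memory/mygroup (the mount point may differ).
 *
 * Returns 0 on success, -1 on failure.
 */
int memcg_disable_oom_killer(const char *cgroup_dir)
{
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/memory.oom_control", cgroup_dir);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* path would have been truncated */

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;              /* no such directory, or no permission */

    int ok = fputs("1\n", f) >= 0;
    if (fclose(f) != 0)         /* a write error may only show up at flush time */
        ok = 0;

    return ok ? 0 : -1;
}
```

A caller would create the group, set its memory.limit_in_bytes, move tasks into it, and then call memcg_disable_oom_killer() on the group's directory; this normally requires root.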

Best regards,

Sebastian


(*): I tried setting up a simple test but failed, so I think I need more reading :-)