From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f50.google.com (mail-pa0-f50.google.com [209.85.220.50]) by kanga.kvack.org (Postfix) with ESMTP id 1B8B56B0035 for ; Tue, 29 Apr 2014 13:31:12 -0400 (EDT) Received: by mail-pa0-f50.google.com with SMTP id rd3so541306pab.23 for ; Tue, 29 Apr 2014 10:31:10 -0700 (PDT) Received: from aserp1040.oracle.com (aserp1040.oracle.com. [141.146.126.69]) by mx.google.com with ESMTPS id yd10si13634719pab.371.2014.04.29.10.31.09 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Tue, 29 Apr 2014 10:31:09 -0700 (PDT) Date: Tue, 29 Apr 2014 13:30:39 -0400 From: Dwight Engen Subject: Re: Protection against container fork bombs [WAS: Re: memcg with kmem limit doesn't recover after disk i/o causes limit to be hit] Message-ID: <20140429133039.162d9dd7@oracle.com> In-Reply-To: <20140429170639.GA25609@dhcp22.suse.cz> References: <20140422200531.GA19334@alpha.arachsys.com> <535758A0.5000500@yuhu.biz> <20140423084942.560ae837@oracle.com> <20140428180025.GC25689@ubuntumail> <20140429072515.GB15058@dhcp22.suse.cz> <20140429130353.GA27354@ubuntumail> <20140429154345.GH15058@dhcp22.suse.cz> <20140429165114.GE6129@localhost.localdomain> <20140429170639.GA25609@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Michal Hocko Cc: Tim Hockin , Richard Davies , Vladimir Davydov , David Rientjes , Marian Marinov , Max Kellermann , Tim Hockin , Frederic Weisbecker , containers@lists.linux-foundation.org, Serge Hallyn , Glauber Costa , "linux-mm@kvack.org" , William Dauchy , Johannes Weiner , Tejun Heo , cgroups@vger.kernel.org, Daniel Walsh On Tue, 29 Apr 2014 19:06:39 +0200 Michal Hocko wrote: > On Tue 29-04-14 09:59:30, Tim Hockin wrote: > > Here's the reason it doesn't work for us: It doesn't work. > > There is a "simple" solution for that. Help us to fix it. > > > It was something like 2 YEARS since we first wanted this, and it > > STILL does not work. > > My recollection is that it was primarily Parallels and Google asking > for the kmem accounting. The reason why I didn't fight against > inclusion although the implementation at the time didn't have a > proper slab shrinking implemented was that that would happen later. > Well, that later hasn't happened yet and we are slowly getting there. > > > You're postponing a pretty simple request indefinitely in > > favor of a much more complex feature, which still doesn't really > > give me what I want. > > But we cannot simply add a new interface that will have to be > maintained for ever just because something else that is supposed to > workaround bugs. > > > What I want is an API that works like rlimit but per-cgroup, rather > > than per-UID. > > You can use an out-of-tree patchset for the time being or help to get > kmem into shape. If there are principal reasons why kmem cannot be > used then you better articulate them. Is there a plan to separately account/limit stack pages vs kmem in general? Richard would have to verify, but I suspect kmem is not currently viable as a process limiter for him because icache/dcache/stack is all accounted together. > > On Tue, Apr 29, 2014 at 9:51 AM, Frederic Weisbecker > > wrote: > > > On Tue, Apr 29, 2014 at 09:06:22AM -0700, Tim Hockin wrote: > > >> Why the insistence that we manage something that REALLY IS a > > >> first-class concept (hey, it has it's own RLIMIT) as a side > > >> effect of something that doesn't quite capture what we want to > > >> achieve? > > > > > > It's not a side effect, the kmem task stack control was partly > > > motivated to solve forkbomb issues in containers. > > > > > > Also in general if we can reuse existing features and code to > > > solve a problem without disturbing side issues, we just do it. > > > > > > Now if kmem doesn't solve the issue for you for any reason, or it > > > does but it brings other problems that aren't fixable in kmem > > > itself, we can certainly reconsider this cgroup subsystem. But I > > > haven't yet seen argument of this kind yet. > > > > > >> > > >> Is there some specific technical reason why you think this is a > > >> bad idea? > > >> I would think, especially in a more unified hierarchy world, > > >> that more cgroup controllers with smaller sets of responsibility > > >> would make for more manageable code (within limits, obviously). > > > > > > Because it's core code and it adds complications and overhead in > > > the fork/exit path. We just don't add new core code just for the > > > sake of slightly prettier interfaces. > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org