From: Waiman Long
Organization: Red Hat
To: Peter Zijlstra, Tejun Heo
Cc: Li Zefan, Johannes Weiner, Ingo Molnar, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org,
 kernel-team@fb.com, pjt@google.com, luto@amacapital.net, efault@gmx.de
Subject: Re: [RFC PATCH v2 11/17] cgroup: Implement new thread mode semantics
Date: Thu, 1 Jun 2017 15:55:41 -0400
References: <1494855256-12558-1-git-send-email-longman@redhat.com>
 <1494855256-12558-12-git-send-email-longman@redhat.com>
 <20170519202624.GA15279@wtj.duckdns.org>
 <20170524203616.GO24798@htj.duckdns.org>
 <9b147a7e-fec3-3b78-7587-3890efcd42f2@redhat.com>
 <20170524212745.GP24798@htj.duckdns.org>
 <20170601145042.GA3494@htj.duckdns.org>
 <20170601151045.xhsv7jauejjis3mi@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/01/2017 02:44 PM, Waiman Long wrote:
> On 06/01/2017 11:10 AM, Peter Zijlstra wrote:
>> On Thu, Jun 01, 2017 at 10:50:42AM -0400, Tejun Heo wrote:
>>> Hello, Waiman.
>>>
>>> A short update. I tried making root special while keeping the
>>> existing threaded semantics but I didn't really like it because we
>>> have to couple controller enables/disables with threaded
>>> enables/disables. I'm now trying a simpler, albeit a bit more
>>> tedious, approach which should leave things mostly symmetrical. I'm
>>> hoping to be able to post mostly working patches this week.
>> I've not had time to look at any of this. But the question I'm most
>> curious about is how cgroup-v2 preserves the container invariant.
>>
>> That is, each container (namespace) should look like a 'real' machine.
>> So just like userns allows to have a uid-0 (aka root) for each container
>> and pidns allows a pid-1 for each container, cgroupns should provide a
>> root group for each container.
>>
>> And cgroup-v2 has this 'exception' (aka wart) for the root group which
>> needs to be replicated for each namespace.
> One of the changes that I proposed in my patches was to get rid of the
> no internal process constraint. I think that will solve a big part of
> the container invariant problem that we have with cgroup v2.
>
> Cheers,
> Longman

Another idea I have for further addressing this container invariant
problem is to set up the cgroups like this:

    CP -- CR

    CP - container parent, belonging to the host
    CR - container root

We can enable pass-through mode in the subtree_control file of CP to
force all CR controllers into pass-through mode. In this case, those
controllers are not enabled in CR itself, just as they are not enabled
in the root cgroup. However, the container can still enable them in
CR's child cgroups, just as the root cgroup can.
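The CP/CR arrangement described above can be sketched as a toy model. This is only an illustration of the proposed semantics, not kernel code and not a real cgroup interface: the `Cgroup` class, the `"passthrough"` mode string, the `build_example` helper, and the controller set given to the host root are all assumptions made up for this sketch.

```python
# Toy model of the proposal above. Not kernel code, not a real cgroup API.
# "passthrough" is a hypothetical mode name taken from the wording of the mail.

class Cgroup:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        # controller name -> "enabled" or "passthrough"; models what this
        # cgroup wrote to its subtree_control file (it affects its children)
        self.subtree_control = {}
        if parent is not None:
            parent.children.append(self)

    def available(self):
        """Controllers this cgroup is allowed to hand down to its children."""
        if self.parent is None:
            # Assumed controller set for the host root, for illustration only.
            return {"cpu", "memory", "io"}
        return set(self.parent.subtree_control)

    def set_subtree_control(self, controller, mode="enabled"):
        if mode not in ("enabled", "passthrough"):
            raise ValueError(f"unknown mode: {mode}")
        if controller not in self.available():
            raise ValueError(f"{controller} is not available in {self.name}")
        self.subtree_control[controller] = mode

    def knobs(self):
        """Controllers whose control files would be visible in this cgroup."""
        if self.parent is None:
            return set()  # the root cgroup carries no control knobs
        return {c for c, mode in self.parent.subtree_control.items()
                if mode == "enabled"}


def build_example():
    """Build host root -> CP -> CR -> app with memory passed through at CP."""
    root = Cgroup("host-root")
    cp = Cgroup("CP", root)    # container parent, owned by the host
    cr = Cgroup("CR", cp)      # container root, as seen by the container
    app = Cgroup("app", cr)    # a child cgroup created inside the container
    root.set_subtree_control("memory")             # knobs appear in CP
    cp.set_subtree_control("memory", "passthrough")  # no knobs in CR
    cr.set_subtree_control("memory")               # container enables it below CR
    return root, cp, cr, app
```

In this model the host sets its limits through the knobs on CP, CR exposes no knobs (so it looks like a root cgroup from inside the container), and the container is still free to enable the controller for CR's children.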

By enabling those controllers at the CP level, the host can control how
much of each resource the container is allowed to use without the
container being aware that its resources are being controlled, since
all the control knobs will show up in CP but not in CR.

Cheers,
Longman