From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751238AbdFAUsy (ORCPT ); Thu, 1 Jun 2017 16:48:54 -0400
Received: from mx1.redhat.com ([209.132.183.28]:59174 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751140AbdFAUsv (ORCPT ); Thu, 1 Jun 2017 16:48:51 -0400
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 51843B1E24
Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com;
	dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com;
	spf=pass smtp.mailfrom=longman@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 51843B1E24
Subject: Re: [RFC PATCH v2 11/17] cgroup: Implement new thread mode semantics
To: Tejun Heo
References: <20170519202624.GA15279@wtj.duckdns.org>
	<20170524203616.GO24798@htj.duckdns.org>
	<9b147a7e-fec3-3b78-7587-3890efcd42f2@redhat.com>
	<20170524212745.GP24798@htj.duckdns.org>
	<20170601145042.GA3494@htj.duckdns.org>
	<20170601151045.xhsv7jauejjis3mi@hirez.programming.kicks-ass.net>
	<20170601184740.GC3494@htj.duckdns.org>
	<20170601203815.GA13390@htj.duckdns.org>
Cc: Peter Zijlstra, Li Zefan, Johannes Weiner, Ingo Molnar,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-mm@kvack.org, kernel-team@fb.com,
	pjt@google.com, luto@amacapital.net, efault@gmx.de
From: Waiman Long
Organization: Red Hat
Message-ID:
Date: Thu, 1 Jun 2017 16:48:48 -0400
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
	Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <20170601203815.GA13390@htj.duckdns.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.30]); Thu, 01 Jun 2017 20:48:50 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/01/2017 04:38 PM, Tejun Heo wrote:
> Hello,
>
> On Thu, Jun 01, 2017 at 03:27:35PM -0400, Waiman Long wrote:
>> As said in an earlier email, I agreed that masking it on the kernel side
>> may not be the best solution. I offer 2 other alternatives:
>> 1) Document on how to work around the resource domains issue by proper
>> setup of the cgroup hierarchy.
> We can definitely improve documentation.
>
>> 2) Mark those controllers that require the no internal process
>> competition constraint and disallow internal process only when those
>> controllers are active.
> We *can* do that but wouldn't this be equivalent to enabling thread
> mode implicitly when only thread aware controllers are enabled?
>
>> I prefer the first alternative, but I can go with the second if necessary.
>>
>> The major rationale behind my enhanced thread mode patch was to allow
>> something like
>>
>>     R -- A -- B
>>      \
>>       T1 -- T2
>>
>> where you can have resource domain controllers enabled in the thread
>> root as well as some child cgroups of the thread root. As no internal
>> process rule is currently not applicable to the thread root, this
>> creates the dilemma that we need to deal with internal process competition.
>>
>> The container invariant that PeterZ talked about will also be a serious
>> issue here as I don't think we are going to set up a container root
>> cgroup that will have no process allowed in it because it has some child
>> cgroups. IMHO, I don't think cgroup v2 will get wide adoption without
>> getting rid of that no internal process constraint.
> The only thing which is necessary from inside a container is putting
> the management processes into their own cgroups so that they can be
> controlled (ie. the same thing you did with your patch but doing that
> explicitly from userland) and userland management sw can do the same
> thing whether it's inside a container or on a bare system.
> BTW, systemd already does so and works completely fine in terms of
> containerization on cgroup2. It is arguable whether we should make
> this more convenient from kernel side but using cgroup2 for resource
> control already requires the userspace tools to be adapted to it, so
> I'm not sure how much benefit we'd gain from adding that compared to
> explicitly documenting it.

I think we are in agreement here. I think we should just document how
userland can work around the internal process competition issue by
setting up the cgroup hierarchy properly. Then we can remove the no
internal process constraint.

Cheers,
Longman
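
For illustration only, the userland setup discussed above (moving the
management processes of a cgroup into a child cgroup of their own before
enabling controllers for the children) might be sketched as follows. This
is a minimal sketch, not the documented procedure: the helper names and the
"mgmt" leaf name are hypothetical, and only the cgroup v2 interface files
cgroup.procs and cgroup.subtree_control are taken from the kernel interface.

```python
import os


def move_procs_to_leaf(cgroup_dir, leaf_name="mgmt"):
    """Move every process listed in cgroup_dir/cgroup.procs into a child cgroup.

    Returns the list of PIDs written to the child's cgroup.procs.
    The leaf name "mgmt" is a hypothetical choice for this sketch.
    """
    leaf_dir = os.path.join(cgroup_dir, leaf_name)
    # On cgroupfs, creating the directory creates the child cgroup.
    os.makedirs(leaf_dir, exist_ok=True)

    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        pids = [int(line) for line in f if line.strip()]

    moved = []
    for pid in pids:
        try:
            # One PID per write, so the file is reopened for each process.
            with open(os.path.join(leaf_dir, "cgroup.procs"), "w") as f:
                f.write(f"{pid}\n")
            moved.append(pid)
        except ProcessLookupError:
            # Assumption: a process that exited in the meantime is skipped.
            pass
    return moved


def enable_controllers(cgroup_dir, controllers):
    """Enable the given controllers for the children of cgroup_dir.

    Returns the string that was written to cgroup.subtree_control.
    """
    request = " ".join("+" + name for name in controllers)
    with open(os.path.join(cgroup_dir, "cgroup.subtree_control"), "w") as f:
        f.write(request)
    return request
```

On a real system cgroup_dir would be a directory under the cgroup2 mount
point and both calls need sufficient privileges. The order matters: the
parent has to be emptied of processes first, otherwise enabling resource
domain controllers in its cgroup.subtree_control is rejected under the no
internal process constraint.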