From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <1489502742.4111.29.camel@gmx.de>
Subject: Re: [PATCHSET for-4.11] cgroup: implement cgroup v2 thread mode
From: Mike Galbraith
To: Tejun Heo
Cc: Peter Zijlstra, lizefan@huawei.com, hannes@cmpxchg.org,
	mingo@redhat.com, pjt@google.com, luto@amacapital.net,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, lvenanci@redhat.com, Linus Torvalds,
	Andrew Morton
Date: Tue, 14 Mar 2017 15:45:42 +0100
In-Reply-To: <20170313192621.GD15709@htj.duckdns.org>
References: <20170203202048.GD6515@twins.programming.kicks-ass.net>
	 <20170203205955.GA9886@mtj.duckdns.org>
	 <20170206124943.GJ6515@twins.programming.kicks-ass.net>
	 <20170208230819.GD25826@htj.duckdns.org>
	 <20170209102909.GC6515@twins.programming.kicks-ass.net>
	 <20170210154508.GA16097@mtj.duckdns.org>
	 <20170210175145.GJ6515@twins.programming.kicks-ass.net>
	 <20170212050544.GJ29323@mtj.duckdns.org>
	 <1486882799.24462.25.camel@gmx.de>
	 <1486964707.5912.93.camel@gmx.de>
	 <20170313192621.GD15709@htj.duckdns.org>

On Mon, 2017-03-13 at 15:26 -0400, Tejun Heo wrote:
> Hello, Mike.
>
> Sorry about the long delay.
>
> On Mon, Feb 13, 2017 at 06:45:07AM +0100, Mike Galbraith wrote:
> > > > So, as long as the depth stays reasonable (single digit or lower),
> > > > what we try to do is keeping tree traversal operations aggregated or
> > > > located on slow paths.  There still are places that this overhead
> > > > shows up (e.g. the block controllers aren't too optimized) but it
> > > > isn't particularly difficult to make a handful of layers not matter at
> > > > all.
> > >
> > > A handful of cpu bean counting layers stings considerably.
>
> Hmm... yeah, I was trying to think about ways to avoid full scheduling
> overhead at each layer (the scheduler does a lot per each layer of
> scheduling) but don't think it's possible to circumvent that without
> introducing a whole lot of scheduling artifacts.

Yup.

> In a lot of workloads, the added overhead from several layers of CPU
> controllers doesn't seem to get in the way too much (most threads do
> something other than scheduling after all).

Sure, if a load doesn't schedule a lot, it doesn't hurt much, but there
are plenty of loads that routinely do schedule a LOT, and there it
matters a LOT.. which is why network benchmarks tend to be severely
allergic to scheduler lard.

> The only major issue that
> we're seeing in the fleet is the cgroup iteration in idle rebalancing
> code pushing up the scheduling latency too much but that's a different
> issue.

Hm, I would suspect PELT to be the culprit there.  It helps smooth out
load balancing, but will stack "skinny looking" tasks.

	-Mike
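
P.S. for anyone wanting to eyeball the per-layer cost themselves: below
is a rough pipe ping-pong sketch (an illustrative hack, not anything
from this thread or the tree; loop count and timing method are
arbitrary).  Run it once from the root cgroup, then again from a shell
parked a few cpu-controller levels deep, and compare round trips/sec.
Schedule-heavy loads like this are where the layers bite.

/*
 * pingpong.c - minimal pipe ping-pong in the spirit of
 * "perf bench sched pipe" (an illustrative hack, not that tool).
 * Each round trip forces a sleep/wakeup pair, so per-layer cgroup
 * accounting cost shows up directly in round trips/sec.
 *
 * Build:  gcc -O2 -o pingpong pingpong.c
 */
#include <stdio.h>
#include <unistd.h>
#include <time.h>
#include <sys/wait.h>

#define LOOPS	1000000		/* arbitrary iteration count */

static double now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
	int ping[2], pong[2];	/* parent->child, child->parent */
	char byte = 0;
	double t0, t1;
	int i;

	if (pipe(ping) || pipe(pong)) {
		perror("pipe");
		return 1;
	}

	if (fork() == 0) {
		/* child: echo every byte straight back */
		for (i = 0; i < LOOPS; i++) {
			if (read(ping[0], &byte, 1) != 1)
				break;
			if (write(pong[1], &byte, 1) != 1)
				break;
		}
		_exit(0);
	}

	t0 = now();
	for (i = 0; i < LOOPS; i++) {
		if (write(ping[1], &byte, 1) != 1)
			break;
		if (read(pong[0], &byte, 1) != 1)
			break;
	}
	t1 = now();
	wait(NULL);

	printf("%d round trips in %.3fs (%.0f/sec)\n",
	       LOOPS, t1 - t0, LOOPS / (t1 - t0));
	return 0;
}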