From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S941176AbcIZMKa (ORCPT ); Mon, 26 Sep 2016 08:10:30 -0400
Received: from bombadil.infradead.org ([198.137.202.9]:47354 "EHLO
        bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S933347AbcIZMK3 (ORCPT );
        Mon, 26 Sep 2016 08:10:29 -0400
Date: Mon, 26 Sep 2016 14:10:25 +0200
From: Peter Zijlstra
To: Christian Borntraeger
Cc: Yuyang Du , Ingo Molnar , Linux Kernel Mailing List ,
        vincent.guittot@linaro.org, Morten.Rasmussen@arm.com,
        dietmar.eggemann@arm.com, pjt@google.com, bsegall@google.com
Subject: Re: group scheduler regression since 4.3 (bisect 9d89c257d
        sched/fair: Rewrite runnable load and utilization average tracking)
Message-ID: <20160926121025.GC5016@twins.programming.kicks-ass.net>
References: <45222b6f-4849-f1f4-fdf5-2a26ac9a3ed4@de.ibm.com>
        <20160926105621.GZ5016@twins.programming.kicks-ass.net>
        <20160926115300.GA5016@twins.programming.kicks-ass.net>
        <4c4e8838-9a6a-62b9-a8b7-48e4d375604e@de.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4c4e8838-9a6a-62b9-a8b7-48e4d375604e@de.ibm.com>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 26, 2016 at 02:01:43PM +0200, Christian Borntraeger wrote:
> They applied ok on next from 9/13. Things go even worse.
> With this host configuration:
>
> CPU NODE BOOK SOCKET CORE L1d:L1i:L2d:L2i ONLINE CONFIGURED ADDRESS
> 0   0    0    0      0    0:0:0:0         yes    yes        0
> 1   0    0    0      0    1:1:1:1         yes    yes        1
> 2   0    0    0      1    2:2:2:2         yes    yes        2
> 3   0    0    0      1    3:3:3:3         yes    yes        3
> 4   0    0    1      2    4:4:4:4         yes    yes        4
> 5   0    0    1      2    5:5:5:5         yes    yes        5
> 6   0    0    1      3    6:6:6:6         yes    yes        6
> 7   0    0    1      3    7:7:7:7         yes    yes        7
> 8   0    0    1      4    8:8:8:8         yes    yes        8
> 9   0    0    1      4    9:9:9:9         yes    yes        9
> 10  0    0    1      5    10:10:10:10     yes    yes        10
> 11  0    0    1      5    11:11:11:11     yes    yes        11
> 12  0    0    1      6    12:12:12:12     yes    yes        12
> 13  0    0    1      6    13:13:13:13     yes    yes        13
> 14  0    0    1      7    14:14:14:14     yes    yes        14
> 15  0    0    1      7    15:15:15:15     yes    yes        15
>
> the guest was running either on 0-3 or on 4-15, but never
> used the full system. With group scheduling disabled everything was good
> again. So looks like that this bug has also some dependency on the
> host topology.

OK, so CPU affinities that unevenly straddle topology boundaries like that
are hard (and are generally not recommended), but it's not immediately
obvious why it would be so much worse with cgroups enabled.
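
For reference, a minimal Python sketch (not part of the original thread)
that tabulates how a CPU affinity set maps onto the sockets in the table
above. The socket map is hard-coded from that table rather than read from
the system, and the helper name cpus_per_socket is hypothetical.

```python
# Socket membership copied by hand from the quoted table:
# CPUs 0-3 are on socket 0, CPUs 4-15 are on socket 1.
# This is an assumption for illustration, not something probed at runtime.
SOCKET_OF_CPU = {cpu: (0 if cpu < 4 else 1) for cpu in range(16)}


def cpus_per_socket(affinity):
    """Return {socket: number of CPUs from `affinity` on that socket}."""
    counts = {}
    for cpu in affinity:
        socket = SOCKET_OF_CPU[cpu]
        counts[socket] = counts.get(socket, 0) + 1
    return counts


if __name__ == "__main__":
    # The two guest placements mentioned in the report, plus the full host.
    for name, cpus in (("0-3", range(0, 4)),
                       ("4-15", range(4, 16)),
                       ("0-15", range(0, 16))):
        print(name, cpus_per_socket(cpus))
```

Run against the table, this makes the asymmetry visible: the two sockets
hold 4 and 12 CPUs respectively.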