From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755438Ab1ILMgR (ORCPT ); Mon, 12 Sep 2011 08:36:17 -0400
Received: from casper.infradead.org ([85.118.1.10]:49710 "EHLO
	casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752562Ab1ILMgQ convert rfc822-to-8bit (ORCPT );
	Mon, 12 Sep 2011 08:36:16 -0400
Subject: Re: CFS Bandwidth Control - Test results of cgroups tasks pinned vs unpinnede
From: Peter Zijlstra
To: Srivatsa Vaddagiri
Cc: Paul Turner , Kamalesh Babulal , Vladimir Davydov ,
	"linux-kernel@vger.kernel.org" , Bharata B Rao , Dhaval Giani ,
	Vaidyanathan Srinivasan , Ingo Molnar , Pavel Emelianov
Date: Mon, 12 Sep 2011 14:35:43 +0200
In-Reply-To: <20110912101722.GA28950@linux.vnet.ibm.com>
References: <20110608163234.GA23031@linux.vnet.ibm.com>
	<20110610181719.GA30330@linux.vnet.ibm.com>
	<20110615053716.GA390@linux.vnet.ibm.com>
	<20110907152009.GA3868@linux.vnet.ibm.com>
	<1315423342.11101.25.camel@twins>
	<20110908151433.GB6587@linux.vnet.ibm.com>
	<1315571462.26517.9.camel@twins>
	<20110912101722.GA28950@linux.vnet.ibm.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8BIT
X-Mailer: Evolution 3.0.2-
Message-ID: <1315830943.26517.36.camel@twins>
Mime-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2011-09-12 at 15:47 +0530, Srivatsa Vaddagiri wrote:
> * Peter Zijlstra [2011-09-09 14:31:02]:
>
> > > Machine : 16-cpus (2 Quad-core w/ HT enabled)
> > > Cgroups : 5 in number (C1-C5), each having {2, 2, 4, 8, 16} tasks respectively.
> > > Further, each task is placed in its own (sub-)cgroup with
> > > a capped usage of 50% CPU.
> >
> > So that's loads: {512,512}, {512,512}, {256,256,256,256}, {128,..} and {64,..}
>
> Yes, with the default shares of 1024 for each cgroup.
>
> FWIW we did also try setting shares for each cgroup proportional to number of
> tasks it has. For ex: C1's shares = 1024 * 2 = 2048, C2 = 1024 * 2 = 2048,
> C3 = 4 * 1024 = 4096 etc. while /C1/C1_1, /C1/C1_2, .../C5/C5_16/ shares were
> left at default of 1024 (as those sub-cgroups contain only one task).
>
> That does help reduce idle time by almost 50% (from 15-20% -> 6-9%)

Of course it does.. and I bet you can improve that slightly if you
manage to fix some of the numerical nightmares that live in the cgroup
load-balancer (Paul, care to share your WIP?)

But the initial scenario is a complete and utter fail, it's impossible to
schedule that sanely. It's an infeasible weight scenario with more tasks
than cpus, and the added bandwidth constraints just keep changing the
set, requiring endless migrations to try and keep utilization from
tanking.

Really, classic fail.
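
For anyone wanting to check the numbers in this thread, here is a rough
sketch of the arithmetic. It reproduces the per-task loads quoted above
and compares each group's share-weighted CPU entitlement against what its
tasks can actually consume under the 50% cap. This is an idealized model
for illustration only, not the scheduler's actual computation; the helper
names (per_task_load, entitlement, cap_limit) are made up for this example.

```python
# Idealized model of the test setup from the thread (not kernel code).
NCPUS = 16          # 2 quad-core sockets with HT
CAP = 0.5           # each task's sub-cgroup is capped at 50% of one CPU
TASKS = {"C1": 2, "C2": 2, "C3": 4, "C4": 8, "C5": 16}


def per_task_load(shares, tasks):
    """Split each group's shares evenly over its single-task sub-cgroups."""
    return {g: shares[g] // tasks[g] for g in tasks}


def entitlement(shares, ncpus=NCPUS):
    """CPUs each group would get if time were divided purely by shares."""
    total = sum(shares.values())
    return {g: ncpus * s / total for g, s in shares.items()}


def cap_limit(tasks, cap=CAP):
    """Most CPU time a group can consume given the per-task cap."""
    return {g: n * cap for g, n in tasks.items()}


default_shares = {g: 1024 for g in TASKS}
proportional_shares = {g: 1024 * n for g, n in TASKS.items()}

for label, shares in (("default", default_shares),
                      ("proportional", proportional_shares)):
    loads = per_task_load(shares, TASKS)
    want = entitlement(shares)
    limit = cap_limit(TASKS)
    print(label)
    for g in TASKS:
        print(f"  {g}: per-task load {loads[g]:5d}, "
              f"entitled {want[g]:.1f} cpus, cap allows {limit[g]:.1f} cpus")
```

Under this model the default shares entitle C1 and C2 to far more CPU
than their capped tasks can use, while C5 is entitled to far less than it
could use. With shares proportional to task count, each group's
entitlement lines up with its cap, which is at least consistent with the
reduced idle time reported above.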