Date: Tue, 13 Sep 2011 21:51:19 +0530
From: Srivatsa Vaddagiri
To: Peter Zijlstra
Cc: Paul Turner, Kamalesh Babulal, Vladimir Davydov,
 "linux-kernel@vger.kernel.org", Bharata B Rao, Dhaval Giani,
 Vaidyanathan Srinivasan, Ingo Molnar, Pavel Emelianov
Subject: Re: CFS Bandwidth Control - Test results of cgroups tasks pinned vs unpinned
Message-ID: <20110913162119.GA3045@linux.vnet.ibm.com>
Reply-To: Srivatsa Vaddagiri
References: <1315423342.11101.25.camel@twins>
 <20110908151433.GB6587@linux.vnet.ibm.com>
 <1315571462.26517.9.camel@twins>
 <20110912101722.GA28950@linux.vnet.ibm.com>
 <1315830943.26517.36.camel@twins>
 <20110913041545.GD11100@linux.vnet.ibm.com>
 <20110913050306.GB7254@linux.vnet.ibm.com>
 <1315906788.575.3.camel@twins>
 <20110913112852.GE7254@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <1315922848.5977.11.camel@twins>
User-Agent: Mutt/1.5.21 (2010-09-15)

* Peter Zijlstra [2011-09-13 16:07:28]:

> > > > This is perhaps not optimal (as it may lead to more lock contention), but
> > > > something to note for those who care for both capping and utilization in
> > > > equal measure!
> > >
> > > You meant lock inversion, which leads to more idle time :-)
> >
> > I think 'cfs_b->lock' contention would go up significantly when reducing
> > sysctl_sched_cfs_bandwidth_slice, while for something like the 'balancing'
> > lock (taken with SD_SERIALIZE set, and more frequently when tuning down
> > max_interval?), yes, it may increase idle time! Did you have any other
> > lock in mind when speaking of inversion?
>
> I can't read, it seems.. I thought you were talking about increasing the
> period,

Mm.. I brought up the increased lock contention with reference to this
experimental result that I posted earlier:

> Tuning min_interval and max_interval of various sched_domains to 1
> and also setting sched_cfs_bandwidth_slice_us to 500 does cut down idle
> time further to 2.7%

The value of sched_cfs_bandwidth_slice_us was reduced from its default of
5000us to 500us, which (along with the reduction of min/max interval)
helped cut idle time down further (3.9% -> 2.7%). I was commenting that
this may not necessarily be optimal, as, for example, a low
'sched_cfs_bandwidth_slice_us' could result in all cpus contending for
cfs_b->lock very frequently (a toy model of this is sketched below).

> which increases the time you force a task to sleep that's holding locks etc..

Ideally all tasks should get capped at (roughly) the same time, given that
there is a global pool from which everyone pulls bandwidth? So while one
vcpu/task (holding a lock) gets capped, other vcpus/tasks (that may want
the same lock) should ideally not keep running for long after that,
avoiding the lock-inversion related problems you point out. I guess that
we may still run into that with the current implementation ..
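To be concrete about the global-pool/slice mechanism I keep referring to,
here is a toy user-space model -- NOT the actual kernel or patch code; the
names and numbers below are made up purely for illustration. Each "cpu"
charges its execution against a local pool and refills that pool from the
shared pool one slice at a time, taking the shared lock once per refill,
so shrinking the slice means proportionally more acquisitions of the
cfs_b->lock analogue for the same amount of consumed quota:

/*
 * Toy model of the global-pool / per-cpu-slice scheme under discussion
 * (not the real implementation): each "cpu" refills its local runtime
 * from a shared pool one slice at a time, taking the shared lock once
 * per refill.
 */
#include <stdio.h>
#include <pthread.h>

#define NR_CPUS		4
#define QUOTA_US	100000	/* global quota for the period (100ms) */
#define SLICE_US	5000	/* cf. sysctl sched_cfs_bandwidth_slice_us */
#define TICK_US		100	/* accounting granularity of the model */

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER; /* ~cfs_b->lock */
static long global_runtime = QUOTA_US;	/* ~global pool */
static long local_runtime[NR_CPUS];	/* ~per-cpu pools */
static long lock_acquisitions;

/* Pull one slice from the global pool; returns 0 if the pool is empty. */
static int assign_local_runtime(int cpu)
{
	long amount = 0;

	pthread_mutex_lock(&pool_lock);
	lock_acquisitions++;
	if (global_runtime > 0) {
		amount = global_runtime < SLICE_US ? global_runtime : SLICE_US;
		global_runtime -= amount;
	}
	pthread_mutex_unlock(&pool_lock);

	local_runtime[cpu] += amount;
	return amount > 0;
}

/* Charge one tick of execution; returns 0 when the cpu must throttle. */
static int account_runtime(int cpu)
{
	local_runtime[cpu] -= TICK_US;
	if (local_runtime[cpu] > 0)
		return 1;
	return assign_local_runtime(cpu);
}

int main(void)
{
	long ran[NR_CPUS] = { 0 };
	int throttled[NR_CPUS] = { 0 };
	int cpu, running = NR_CPUS;

	/* Run all cpus round-robin until every one of them is throttled. */
	while (running) {
		running = 0;
		for (cpu = 0; cpu < NR_CPUS; cpu++) {
			if (throttled[cpu])
				continue;
			ran[cpu] += TICK_US;
			if (account_runtime(cpu))
				running++;
			else
				throttled[cpu] = 1;
		}
	}

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		printf("cpu%d ran %ldus before throttling\n", cpu, ran[cpu]);
	printf("shared lock taken %ld times (slice = %dus)\n",
	       lock_acquisitions, SLICE_US);
	return 0;
}

(Compile with gcc -pthread.) With SLICE_US at 5000 each cpu takes the lock
only a handful of times over the whole 100ms quota; dropping SLICE_US to
500 makes that roughly ten times as frequent for the same consumed
bandwidth, which is the contention I was worried about.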
Basically, the global pool may have zero runtime left for the current
period, forcing a vcpu/task to be throttled, while there is surplus
runtime left in the per-cpu pools, allowing some sibling vcpus/tasks to
run for a wee bit longer, leading to the lock-inversion related problems
(more idling).

That makes me think we can improve the directed-yield -> capping
interaction. Essentially, when the target task of a directed yield is
capped, can the "yielding" task donate some of its bandwidth (a rough
sketch of what I mean follows my sig)?

- vatsa
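PS: To make the donation idea a bit more concrete, here is a purely
hypothetical sketch (again a user-space toy -- donate_runtime(), the
'keep' threshold and struct toy_cfs_rq are made-up names, not anything
that exists in the patch set). The idea is simply that a yielding
vcpu/task with surplus local runtime hands part of it to a throttled
yield target, so that the lock holder can be unthrottled within the
current period:

/*
 * Hypothetical sketch of "directed yield donates bandwidth":
 * if the yield target is throttled because its pool ran dry, the
 * yielding side gives away part of its own surplus runtime.
 */
#include <stdio.h>

struct toy_cfs_rq {
	long runtime_remaining;		/* local bandwidth left, in us */
	int  throttled;
};

/*
 * Donate up to 'amount' us from 'donor' to 'target', keeping at least
 * 'keep' us so the donor does not immediately throttle itself.
 * Returns the amount actually transferred.
 */
static long donate_runtime(struct toy_cfs_rq *donor,
			   struct toy_cfs_rq *target,
			   long amount, long keep)
{
	long surplus = donor->runtime_remaining - keep;

	if (donor->throttled || surplus <= 0)
		return 0;
	if (amount > surplus)
		amount = surplus;

	donor->runtime_remaining -= amount;
	target->runtime_remaining += amount;
	if (target->throttled && target->runtime_remaining > 0)
		target->throttled = 0;	/* would trigger an unthrottle */

	return amount;
}

int main(void)
{
	/* Yielding vcpu still has 4ms of local runtime ... */
	struct toy_cfs_rq donor  = { .runtime_remaining = 4000, .throttled = 0 };
	/* ... while the lock holder it yields to is already throttled. */
	struct toy_cfs_rq target = { .runtime_remaining = -200, .throttled = 1 };

	long moved = donate_runtime(&donor, &target, 1000, 1000);

	printf("donated %ldus: donor left with %ldus, target %s (%ldus)\n",
	       moved, donor.runtime_remaining,
	       target.throttled ? "still throttled" : "unthrottled",
	       target.runtime_remaining);
	return 0;
}

How much the donor should retain, and whether such a transfer should go
via the global pool under cfs_b->lock instead of cpu-to-cpu, are exactly
the kind of details that would need discussion.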