From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756379Ab3BXRxI (ORCPT );
	Sun, 24 Feb 2013 12:53:08 -0500
Received: from e23smtp02.au.ibm.com ([202.81.31.144]:60597 "EHLO
	e23smtp02.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753934Ab3BXRxH (ORCPT );
	Sun, 24 Feb 2013 12:53:07 -0500
Message-ID: <512A5332.6040507@linux.vnet.ibm.com>
Date: Sun, 24 Feb 2013 23:21:46 +0530
From: Preeti U Murthy
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120717 Thunderbird/14.0
MIME-Version: 1.0
To: Alex Shi
CC: Peter Zijlstra, torvalds@linux-foundation.org, mingo@redhat.com,
	tglx@linutronix.de, akpm@linux-foundation.org, arjan@linux.intel.com,
	bp@alien8.de, pjt@google.com, namhyung@kernel.org, efault@gmx.de,
	vincent.guittot@linaro.org, gregkh@linuxfoundation.org,
	viresh.kumar@linaro.org, linux-kernel@vger.kernel.org,
	morten.rasmussen@arm.com
Subject: Re: [patch v5 09/15] sched: add power aware scheduling in fork/exec/wake
References: <1361164062-20111-1-git-send-email-alex.shi@intel.com>
	<1361164062-20111-10-git-send-email-alex.shi@intel.com>
	<1361353360.10155.9.camel@laptop> <5124BCEB.8030606@intel.com>
	<1361367371.10155.32.camel@laptop> <5124DC76.2010801@intel.com>
	<1361453587.26780.18.camel@laptop> <512631C7.8060103@intel.com>
	<1361523279.26780.45.camel@laptop> <5129DD1A.8070509@intel.com>
In-Reply-To: <5129DD1A.8070509@intel.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 13022417-5490-0000-0000-00000303932E
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 02/24/2013 02:57 PM, Alex Shi wrote:
> On 02/22/2013 04:54 PM, Peter Zijlstra wrote:
>> On Thu, 2013-02-21 at 22:40 +0800, Alex Shi wrote:
>>>> The name is a secondary issue, first you need to explain why you
>>>> think nr_running is a useful metric at all.
>>>>
>>>> You can have a high nr_running and a low utilization (a burst of
>>>> wakeups, each waking a process that'll instantly go to sleep again),
>>>> or a low nr_running and high utilization (a single cpu-bound process).
>>>
>>> That is true for periodic balance. But at fork/exec/wake time, the
>>> incoming processes usually need to do something before sleeping again.
>>
>> You'd be surprised, there's a fair number of workloads that have
>> negligible runtime on wakeup.
>
> I would appreciate it if you could introduce some of those workloads. :)
> BTW, do you have any idea how to handle them?
> Actually, if tasks are that transitory, they are also hard to catch in
> balancing; with 'cyclictest -t 100' on my 4-LCPU laptop, vmstat can
> only catch 1 or 2 tasks every second.
>>
>>> I use nr_running to measure how busy the group is, for 3 reasons:
>>> 1. The current performance policy doesn't use utilization either.
>>
>> We were planning to fix that now that it's available.
>
> I had tried, but failed on the aim9 benchmark. As a result I gave up
> on using utilization in performance balance.
> Some of the attempts and discussion are in these threads:
> https://lkml.org/lkml/2013/1/6/96
> https://lkml.org/lkml/2013/1/22/662
>>
>>> 2. The power policy doesn't care about load weight.
>>
>> Then it's broken, it should very much still care about weight.
>
> Here the power policy just uses nr_running as the criterion to check
> whether a group is eligible for power-aware balancing; when actually
> doing the balancing, load weight is still the key judgment.
>
>>
>>> 3. I tested some benchmarks, kbuild/tbench/hackbench/aim7 etc., and
>>> some results looked clearly bad when using utilization; if my memory
>>> is right, hackbench/aim7 both looked bad. I tried many ways to bring
>>> utilization into this balancing, like using utilization only, or
>>> utilization * nr_running, etc., but still could not find a way to
>>> recover the loss. With nr_running, though, performance doesn't seem
>>> to lose much under the power policy.
>>
>> You're failing to explain why utilization performs badly, and you
>> don't explain why nr_running is better. That things happen to work
>> simply isn't good enough.
>
> Um, let me try to explain again. Utilisation needs a lot of time to
> accumulate (345ms). With or without load weight, many bursting tasks
> contribute only a minimal weight to the CPU that carries them during
> the first few ms. So it is too easy to make an incorrect distribution
> here, which then needs migration in a later periodic balance.

Why can't this be attacked in either of the following ways:

1. Attack the problem at its source, by ensuring that utilisation
   accumulates faster, i.e. by making the update window smaller.

2. Balance on nr_running only if you detect burst wakeups. Alex, you
   released a patch earlier which could detect this, right? Instead of
   balancing on nr_running all the time, why not balance on it only
   when burst wakeups are detected? By doing so you ensure that
   nr_running as a load-balancing metric is used when it is right to do
   so, and the reason for using it also gets well documented.

Regards
Preeti U Murthy