From: Alex Shi <alex.shi@intel.com>
To: Preeti U Murthy <preeti@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: torvalds@linux-foundation.org, mingo@redhat.com,
	tglx@linutronix.de, akpm@linux-foundation.org,
	arjan@linux.intel.com, bp@alien8.de, pjt@google.com,
	namhyung@kernel.org, efault@gmx.de, vincent.guittot@linaro.org,
	gregkh@linuxfoundation.org, viresh.kumar@linaro.org,
	linux-kernel@vger.kernel.org, morten.rasmussen@arm.com
Subject: Re: [patch v5 09/15] sched: add power aware scheduling in fork/exec/wake
Date: Mon, 25 Feb 2013 10:23:53 +0800
Message-ID: <512ACB39.1010605@intel.com>
In-Reply-To: <512A5332.6040507@linux.vnet.ibm.com>

On 02/25/2013 01:51 AM, Preeti U Murthy wrote:
> Hi,
> 
> On 02/24/2013 02:57 PM, Alex Shi wrote:
>> On 02/22/2013 04:54 PM, Peter Zijlstra wrote:
>>> On Thu, 2013-02-21 at 22:40 +0800, Alex Shi wrote:
>>>>> The name is a secondary issue, first you need to explain why you
>>>> think
>>>>> nr_running is a useful metric at all.
>>>>>
>>>>> You can have a high nr_running and a low utilization (a burst of
>>>>> wakeups, each waking a process that'll instantly go to sleep again),
>>>> or
>>>>> low nr_running and high utilization (a single process cpu bound
>>>>> process).
>>>>
>>>> It is true in periodic balance. But in fork/exec/waking timing, the
>>>> incoming processes usually need to do something before sleep again.
>>>
>>> You'd be surprised, there's a fair number of workloads that have
>>> negligible runtime on wakeup.
>>
>> will appreciate if you like introduce some workload. :)
>> BTW, do you has some idea to handle them?
>> Actually, if tasks is just like transitory, it is also hard to catch
>> them in balance, like 'cyclitest -t 100' on my 4 LCPU laptop, vmstat
>> just can catch 1 or 2 tasks very second.
>>>
>>>> I use nr_running to measure how the group busy, due to 3 reasons:
>>>> 1, the current performance policy doesn't use utilization too.
>>>
>>> We were planning to fix that now that its available.
>>
>> I had tried, but failed on aim9 benchmark. As a result I give up to use
>> utilization in performance balance.
>> Some trying and talking in the thread.
>> https://lkml.org/lkml/2013/1/6/96
>> https://lkml.org/lkml/2013/1/22/662
>>>
>>>> 2, the power policy don't care load weight.
>>>
>>> Then its broken, it should very much still care about weight.
>>
>> Here power policy just use nr_running as the criteria to check if it's
>> eligible for power aware balance. when do balancing the load weight is
>> still the key judgment.
>>
>>>
>>>> 3, I tested some benchmarks, kbuild/tbench/hackbench/aim7 etc, some
>>>> benchmark results looks clear bad when use utilization. if my memory
>>>> right, the hackbench/aim7 both looks bad. I had tried many ways to
>>>> engage utilization into this balance, like use utilization only, or
>>>> use
>>>> utilization * nr_running etc. but still can not find a way to recover
>>>> the lose. But with nr_running, the performance seems doesn't lose much
>>>> with power policy.
>>>
>>> You're failing to explain why utilization performs bad and you don't
>>> explain why nr_running is better. That things work simply isn't good
>>
>> Um, let me try to explain again, The utilisation need much time to
>> accumulate itself(345ms). Whenever with or without load weight, many
>> bursting tasks just give a minimum weight to the carrier CPU at the
>> first few ms. So, it is too easy to do a incorrect distribution here and
>> need migration on later periodic balancing.
> 
> Why can't this be attacked in *either* of the following ways:
> 
> 1.Attack this problem at the source, by ensuring that the utilisation is
> accumulated faster by making the update window smaller.

It is a double-edged sword. A small period responds quickly, but
loses much of the history. An extremely short period is just the same as
the current instant utilization.
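
To put rough numbers on that trade-off, here is a minimal sketch. It
assumes a geometric decay in which a contribution halves every
`half_life_ms` periods of roughly 1 ms, with 32 as the half-life I believe
the runnable average uses; the function names are hypothetical and this is
not the kernel code.

```python
def decay_factor(half_life_ms):
    """Per-millisecond decay y, chosen so that y ** half_life_ms == 0.5."""
    return 0.5 ** (1.0 / half_life_ms)


def ramp_up(half_life_ms, run_ms):
    """Fraction of the maximum average reached by a task that has been
    continuously runnable for run_ms (geometric series, normalized to 1)."""
    return 1.0 - decay_factor(half_life_ms) ** run_ms


def history_weight(half_life_ms, age_ms):
    """Relative weight still given to a sample that is age_ms old."""
    return decay_factor(half_life_ms) ** age_ms


if __name__ == "__main__":
    # 32 is the assumed current half-life, 4 is an arbitrary "small window".
    for half_life in (32, 4):
        print("half-life %2d ms: after 4 ms running = %.3f, "
              "weight of 32 ms old sample = %.4f"
              % (half_life, ramp_up(half_life, 4),
                 history_weight(half_life, 32)))
```

With the longer half-life a freshly woken task shows only a small fraction
of its eventual utilization after a few ms; with the short one it ramps up
fast, but a sample from 32 ms ago is almost forgotten.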
> 
> 2.Balance on nr->running only if you detect burst wakeups.
> Alex, you had released a patch earlier which could detect this right?

Yes, the patch is here:
https://lkml.org/lkml/2013/1/11/45

One problem is how to decide the criteria for a burst. If we define
5 wakeups/ms as a burst, we will miss the 4 wakeups/ms case.
Another problem is the burst detection cost: we need to track the wakeup
history over a period, preferably for the whole group, but that adds
extra cost during the burst.
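
A minimal sketch of the threshold problem, not the patch linked above: a
sliding-window wakeup counter with a fixed cutoff. The class name, the
10 ms window and the integer-microsecond timestamps are all assumptions
made for illustration.

```python
from collections import deque


class BurstDetector:
    """Flags a burst when wakeups per ms over the window reach a threshold."""

    def __init__(self, threshold_per_ms, window_us):
        self.threshold_per_ms = threshold_per_ms
        self.window_us = window_us
        self.stamps = deque()  # the history we pay to keep on every wakeup

    def record_wakeup(self, now_us):
        self.stamps.append(now_us)
        # Drop wakeups that have fallen out of the window.
        while self.stamps and self.stamps[0] <= now_us - self.window_us:
            self.stamps.popleft()

    def is_burst(self):
        rate_per_ms = len(self.stamps) * 1000.0 / self.window_us
        return rate_per_ms >= self.threshold_per_ms


def feed(detector, wakeups_per_ms, duration_ms):
    """Feed evenly spaced wakeups at a steady rate for duration_ms."""
    for ms in range(duration_ms):
        for i in range(wakeups_per_ms):
            detector.record_wakeup(ms * 1000 + i * 1000 // wakeups_per_ms)
    return detector.is_burst()


if __name__ == "__main__":
    for rate in (5, 4):
        detector = BurstDetector(threshold_per_ms=5, window_us=10000)
        print(rate, "wakeups/ms ->", feed(detector, rate, 20))
```

The 4 wakeups/ms load is nearly as bursty as the 5 wakeups/ms one, yet it
falls on the other side of the cutoff, and the deque is exactly the
per-wakeup bookkeeping cost mentioned above.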

Solution candidates:
https://lkml.org/lkml/2013/1/21/316
After talking with MikeG, I removed the runnable load avg from the
performance load balance.

Using nr_running as the instant utilization may narrow the situations
that the power policy suits. But consider power consumption: a light but
CPU-intensive task costs much more power than a heavy-load task that
runs only occasionally. And nr_running fits all my benchmarks:
aim7/hackbench/kbuild/cyclictest/netperf etc.

> Instead of balancing on nr_running all the time, why not balance on it
> only if burst wakeups are detected. By doing so you ensure that
> nr_running as a metric for load balancing is used when it is right to do
> so and the reason to use it also gets well documented.
> 
> Regards
> Preeti U Murthy
> 


-- 
Thanks Alex
