Re: [patch v7 0/21] sched: power aware scheduling

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Alex Shi <alex.shi@intel.com>
To: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Mike Galbraith <bitbucket@online.de>,
	Ingo Molnar <mingo@kernel.org>, Len Brown <lenb@kernel.org>,
	Borislav Petkov <bp@alien8.de>,
	mingo@redhat.com, peterz@infradead.org, tglx@linutronix.de,
	akpm@linux-foundation.org, arjan@linux.intel.com, pjt@google.com,
	namhyung@kernel.org, morten.rasmussen@arm.com,
	vincent.guittot@linaro.org, gregkh@linuxfoundation.org,
	viresh.kumar@linaro.org, linux-kernel@vger.kernel.org,
	len.brown@intel.com, rafael.j.wysocki@intel.com, jkosina@suse.cz,
	clark.williams@gmail.com, tony.luck@intel.com,
	keescook@chromium.org, mgorman@suse.de, riel@redhat.com,
	Linux PM list <linux-pm@vger.kernel.org>
Subject: Re: [patch v7 0/21] sched: power aware scheduling
Date: Mon, 20 May 2013 09:01:41 +0800	[thread overview]
Message-ID: <519975F5.20400@intel.com> (raw)
In-Reply-To: <5195E4F9.60908@linux.vnet.ibm.com>


>>>>> Which are the workloads where 'powersaving' mode hurts workload 
>>>>> performance measurably?
> 
> I ran ebizzy on a 2 socket, 16 core, SMT 4 Power machine.

Is this a 2 * 16 * 4 LCPUs PowerPC machine?
> The power efficiency drops significantly with the powersaving policy of
> this patch,over the power efficiency of the scheduler without this patch.
> 
> The below parameters are measured relative to the default scheduler
> behaviour.
> 
> A: Drop in power efficiency with the patch+powersaving policy
> B: Drop in performance with the patch+powersaving policy
> C: Decrease in power consumption with the patch+powersaving policy
> 
> NumThreads      A            B         C
> -----------------------------------------
> 2               33%         36%       4%
> 4               31%         33%       3%
> 8               28%         30%       3%
> 16              31%         33%       4%
> 
> Each of the above run is for 30s.
> 
> On investigating socket utilization,I found that only 1 socket was being
> used during all the above threaded runs. As can be guessed this is due
> to the group_weight being considered for the threshold metric.
> This stacks up tasks on a core and further on a socket, thus throttling
> them, as observed by Mike below.
> 
> I therefore think we must switch to group_capacity as the metric for
> threshold and use only (rq->utils*nr_running) for group_utils
> calculation during non-bursty wakeup scenarios.
> This way we are comparing right; the utilization of the runqueue by the
> fair tasks and the cpu capacity available for them after being consumed
> by the rt tasks.
> 
> After I made the above modification,all the above three parameters came
> to be nearly null. However, I am observing the load balancing of the
> scheduler with the patch and powersavings policy enabled. It is behaving
> very close to the default scheduler (spreading tasks across sockets).
> That also explains why there is no performance drop or gain with the
> patch+powersavings policy enabled. I will look into this observation and
> revert.

Thanks a lot for the great testings!
Seem tasks per SMT cpu isn't power efficient.
And I got the similar result last week. I tested the fspin testing(do
endless calculation, in linux-next tree.). when I bind task per SMT cpu,
the power efficiency really dropped with most every threads number. but
when bind task per core, it has better power efficiency on all threads.
Beside to move task depend on group_capacity, another choice is balance
task according cpu_power. I did the transfer in code. but need to go
through a internal open source process before public them.
> 
>>>>
>>>> Well, it'll lose throughput any time there's parallel execution
>>>> potential but it's serialized instead.. using average will inevitably
>>>> stack tasks sometimes, but that's its goal.  Hackbench shows it.
>>>
>>> (but that consolidation can be a winner too, and I bet a nickle it would
>>> be for a socket sized pgbench run)
>>
>> (belay that, was thinking of keeping all tasks on a single node, but
>> it'll likely stack the whole thing on a CPU or two, if so, it'll hurt)
> 
> At this point, I would like to raise one issue.
> *Is the goal of the power aware scheduler improving power efficiency of
> the scheduler or a compromise on the power efficiency but definitely a
> decrease in power consumption, since it is the user who has decided to
> prioritise lower power consumption over performance* ?
> 

It could be one of reason for this feather, but I could like to
make it has better efficiency, like packing tasks according to cpu_power
not current group_weight.
>>
> 
> Regards
> Preeti U Murthy
> 


-- 
Thanks
    Alex

next prev parent reply	other threads:[~2013-05-20  1:02 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-04  2:00 [patch v7 0/21] sched: power aware scheduling Alex Shi
2013-04-04  2:00 ` [patch v7 01/21] Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking" Alex Shi
2013-04-04  2:00 ` [patch v7 02/21] sched: set initial value of runnable avg for new forked task Alex Shi
2013-05-06  3:24   ` Preeti U Murthy
2013-04-04  2:00 ` [patch v7 03/21] sched: add sched balance policies in kernel Alex Shi
2013-04-04  2:00 ` [patch v7 04/21] sched: add sysfs interface for sched_balance_policy selection Alex Shi
2013-04-04  2:00 ` [patch v7 05/21] sched: log the cpu utilization at rq Alex Shi
2013-05-06  3:26   ` Preeti U Murthy
2013-05-06  5:22     ` Alex Shi
2013-05-06 12:03   ` Phil Carmody
2013-05-06 12:35     ` Alex Shi
2013-05-06 21:19   ` Paul Turner
2013-04-04  2:00 ` [patch v7 06/21] sched: add new sg/sd_lb_stats fields for incoming fork/exec/wake balancing Alex Shi
2013-04-04  2:00 ` [patch v7 07/21] sched: move sg/sd_lb_stats struct ahead Alex Shi
2013-04-04  2:00 ` [patch v7 08/21] sched: scale_rt_power rename and meaning change Alex Shi
2013-04-04  2:00 ` [patch v7 09/21] sched: get rq potential maximum utilization Alex Shi
2013-04-04  2:00 ` [patch v7 10/21] sched: add power aware scheduling in fork/exec/wake Alex Shi
2013-04-04  2:00 ` [patch v7 11/21] sched: add sched_burst_threshold_ns as wakeup burst indicator Alex Shi
2013-04-04  2:00 ` [patch v7 12/21] sched: using avg_idle to detect bursty wakeup Alex Shi
2013-04-04  2:00 ` [patch v7 13/21] sched: packing transitory tasks in wakeup power balancing Alex Shi
2013-04-04  2:00 ` [patch v7 14/21] sched: add power/performance balance allow flag Alex Shi
2013-04-04  2:00 ` [patch v7 15/21] sched: pull all tasks from source group Alex Shi
2013-04-04  5:59   ` Namhyung Kim
2013-04-06 11:49     ` Alex Shi
2013-04-04  2:00 ` [patch v7 16/21] sched: no balance for prefer_sibling in power scheduling Alex Shi
2013-05-06  3:26   ` Preeti U Murthy
2013-04-04  2:00 ` [patch v7 17/21] sched: add new members of sd_lb_stats Alex Shi
2013-04-04  2:00 ` [patch v7 18/21] sched: power aware load balance Alex Shi
2013-04-04  2:01 ` [patch v7 19/21] sched: lazy power balance Alex Shi
2013-04-04  2:01 ` [patch v7 20/21] sched: don't do power balance on share cpu power domain Alex Shi
2013-04-08  3:17   ` Preeti U Murthy
2013-04-08  3:25     ` Preeti U Murthy
2013-04-08  4:19       ` Alex Shi
2013-04-04  2:01 ` [patch v7 21/21] sched: make sure select_tas_rq_fair get a cpu Alex Shi
2013-04-11 21:02 ` [patch v7 0/21] sched: power aware scheduling Len Brown
2013-04-12  8:46   ` Alex Shi
2013-04-12 16:23     ` Borislav Petkov
2013-04-12 16:48       ` Mike Galbraith
2013-04-12 17:12         ` Borislav Petkov
2013-04-14  1:36           ` Alex Shi
2013-04-17 21:53         ` Len Brown
2013-04-18  1:51           ` Mike Galbraith
2013-04-26 15:11           ` Mike Galbraith
2013-04-30  5:16             ` Mike Galbraith
2013-04-30  8:30               ` Mike Galbraith
2013-04-30  8:41                 ` Ingo Molnar
2013-04-30  9:35                   ` Mike Galbraith
2013-04-30  9:49                     ` Mike Galbraith
2013-04-30  9:56                       ` Mike Galbraith
2013-05-17  8:06                         ` Preeti U Murthy
2013-05-20  1:01                           ` Alex Shi [this message]
2013-05-20  2:30                             ` Preeti U Murthy
2013-04-14  1:28       ` Alex Shi
2013-04-14  5:10         ` Alex Shi
2013-04-14 15:59         ` Borislav Petkov
2013-04-15  6:04           ` Alex Shi
2013-04-15  6:16             ` Alex Shi
2013-04-15  9:52               ` Borislav Petkov
2013-04-15 13:50                 ` Alex Shi
2013-04-15 23:12                   ` Borislav Petkov
2013-04-16  0:22                     ` Alex Shi
2013-04-16 10:24                       ` Borislav Petkov
2013-04-17  1:18                         ` Alex Shi
2013-04-17  7:38                           ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=519975F5.20400@intel.com \
    --to=alex.shi@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=arjan@linux.intel.com \
    --cc=bitbucket@online.de \
    --cc=bp@alien8.de \
    --cc=clark.williams@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jkosina@suse.cz \
    --cc=keescook@chromium.org \
    --cc=len.brown@intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=morten.rasmussen@arm.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=preeti@linux.vnet.ibm.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox