From: Namhyung Kim
To: Alex Shi
Cc: mingo@redhat.com, peterz@infradead.org, tglx@linutronix.de, akpm@linux-foundation.org, arjan@linux.intel.com, bp@alien8.de, pjt@google.com, efault@gmx.de, vincent.guittot@linaro.org, gregkh@linuxfoundation.org, preeti@linux.vnet.ibm.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 16/22] sched: add power aware scheduling in fork/exec/wake
References: <1357375071-11793-1-git-send-email-alex.shi@intel.com> <1357375071-11793-17-git-send-email-alex.shi@intel.com>
Date: Mon, 14 Jan 2013 16:03:01 +0900
In-Reply-To: <1357375071-11793-17-git-send-email-alex.shi@intel.com> (Alex Shi's message of "Sat, 5 Jan 2013 16:37:45 +0800")
Message-ID: <87txqknt5m.fsf@sejong.aot.lge.com>

On Sat, 5 Jan 2013 16:37:45 +0800, Alex Shi wrote:
> This patch adds power aware scheduling in fork/exec/wake. It tries to
> select a cpu from the busiest group that still has spare utilization.
> That will save power for the other groups.
>
> The trade-off is adding power-aware statistics collection during group
> seeking. But since the collection only happens when the power-scheduling
> eligibility condition holds, the worst case in hackbench testing drops
> only about 2% with the powersaving/balance policy. No clear change for
> the performance policy.
>
> I had tried to use the rq load avg utilisation in this balancing, but
> since that utilisation needs much time to accumulate, it is unfit for
> burst balancing. So I use nr_running as the instant rq utilisation.
>
> Signed-off-by: Alex Shi
> ---
[snip]
> +/*
> + * Try to collect the task running number and capacity of the domain.
> + */
> +static void get_sd_power_stats(struct sched_domain *sd,
> +		struct task_struct *p, struct sd_lb_stats *sds)
> +{
> +	struct sched_group *group;
> +	struct sg_lb_stats sgs;
> +	int sd_min_delta = INT_MAX;
> +	int cpu = task_cpu(p);
> +
> +	group = sd->groups;
> +	do {
> +		long g_delta;
> +		unsigned long threshold;
> +
> +		if (!cpumask_test_cpu(cpu, sched_group_mask(group)))
> +			continue;

Why? That means only the local group's stats will be accounted for this
domain, right? Is that your intention?

Thanks,
Namhyung

> +
> +		memset(&sgs, 0, sizeof(sgs));
> +		get_sg_power_stats(group, sd, &sgs);
> +
> +		if (sched_policy == SCHED_POLICY_POWERSAVING)
> +			threshold = sgs.group_weight;
> +		else
> +			threshold = sgs.group_capacity;
> +
> +		g_delta = threshold - sgs.group_utils;
> +
> +		if (g_delta > 0 && g_delta < sd_min_delta) {
> +			sd_min_delta = g_delta;
> +			sds->group_leader = group;
> +		}
> +
> +		sds->sd_utils += sgs.group_utils;
> +		sds->total_pwr += group->sgp->power;
> +	} while (group = group->next, group != sd->groups);
> +
> +	sds->sd_capacity = DIV_ROUND_CLOSEST(sds->total_pwr,
> +						SCHED_POWER_SCALE);
> +}