From: Michael Wang <wangyun@linux.vnet.ibm.com>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
linux-kernel@vger.kernel.org,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
tglx@linutronix.de
Subject: Re: [RFC 2/2] sched/fair: prefer a CPU in the "lowest" idle state
Date: Thu, 31 Jan 2013 14:39:20 +0800 [thread overview]
Message-ID: <510A1198.2070803@linux.vnet.ibm.com> (raw)
In-Reply-To: <8738xiue31.fsf@sejong.aot.lge.com>
On 01/31/2013 01:16 PM, Namhyung Kim wrote:
> Hi Sebastian and Michael,
>
> On Thu, 31 Jan 2013 10:12:35 +0800, Michael Wang wrote:
>> On 01/31/2013 05:19 AM, Sebastian Andrzej Siewior wrote:
>>> If a new CPU has to be choosen for a task, then the scheduler first selects
>>> the group with the least load. This group is returned if its load is lower
>>> compared to the group to which the task is currently assigned.
>>> If there are several groups with completely idle CPU(s) (the CPU is in
>>> an idle state like C1) then the first group is returned.
>>> This patch extends this decision by considering the idle state of CPU(s)
>>> in the group and the first group with a CPU in the lowest idle state
>>> wins (C1 is prefered over C2). If there is a CPU which is not in an idle
>>> state (C0) but has no tasks assigned then it is consider as a valid target.
>>> Should there be no CPU in an idle state at disposal then the loadavg is
>>> used as a fallback.
> [snip]
>>> @@ -3181,8 +3182,10 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>>> int this_cpu, int load_idx)
>>> {
>>> struct sched_group *idlest = NULL, *group = sd->groups;
>>> + struct sched_group *idle_group = NULL;
>>> unsigned long min_load = ULONG_MAX, this_load = 0;
>>> int imbalance = 100 + (sd->imbalance_pct-100)/2;
>>> + int least_idle_cpu = INT_MAX;
>>>
>>> do {
>>> unsigned long load, avg_load;
>>> @@ -3208,6 +3211,25 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>>> load = target_load(i, load_idx);
>>>
>>> avg_load += load;
>>> + if (!local_group && sd->prefer_lp && least_idle_cpu) {
>>> + int idle_level;
>>> +
>>> + idle_level = cpuidle_get_state(i);
>>> + /*
>>> + * Select the CPU which is in the lowest
>>> + * possible power state. Take the active
>>> + * CPU only if its run queue is empty.
>>> + */
>>> + if (!idle_level) {
>>> + if (idle_cpu(i)) {
>>> + least_idle_cpu = idle_level;
>>> + idle_group = group;
>>> + }
>>> + } else if (least_idle_cpu > idle_level) {
>>> + least_idle_cpu = idle_level;
>>> + idle_group = group;
>>> + }
>>> + }
>>> }
>>>
>>> /* Adjust by relative CPU power of the group */
>>> @@ -3221,6 +3243,8 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>>> }
>>> } while (group = group->next, group != sd->groups);
>>>
>>> + if (idle_group)
>>> + return idle_group;
>>
>> I'm not sure, but just concern about this case:
>>
>> group 0 cpu 0 cpu 1
>> least idle 4 task
>>
>> group 1 cpu 2 cpu 3
>> 1 task 1 task
>>
>> The previous logical will pick group 1 and now it will take group 0, and
>> that cause more imbalance, doesn't it?
>>
>> May be check that state in find_idlest_cpu() will be better?
>
> Right, at least find_idlest_cpu() should also check the idle_level IMHO.
>
> Anyway, I have an idea with this in mind. It's like adding a new "idle
> load" to each idle cpu rather than special casing the idle cpus like
> above. IOW an idle cpu will get very small load weight depends on how
> deep it's slept so that it can be compared to other cpus in a same way
> but we can find prefered (lowest load) cpu among the idle cpus.
>
> The simple way I can think of is adding idle_level to a rq load in
> weighted_cpuload():
>
> static unsigned long weighted_cpuload(const int cpu)
> {
> return cpu_rq(cpu)->load.weight + cpuidle_get_state(cpu);
> }
Hmm... then we don't need changes in find_idlest_cpu(), just compare the
load as before, but it works only when the appendix state value is
smaller than the lowest load of one task, which is 15 currently, I'm not
sure whether we have the promise...
Regards,
Michael Wang
>
> What do you think?
>
> Thanks,
> Namhyung
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
next prev parent reply other threads:[~2013-01-31 6:39 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-30 21:19 [RFC] Consider CPU idle state while choosing a new CPU Sebastian Andrzej Siewior
2013-01-30 21:19 ` [RFC 1/2] cpuidle: trace state of the CPU Sebastian Andrzej Siewior
2013-01-31 5:21 ` Namhyung Kim
2013-02-02 17:35 ` Sebastian Andrzej Siewior
2013-01-30 21:19 ` [RFC 2/2] sched/fair: prefer a CPU in the "lowest" idle state Sebastian Andrzej Siewior
2013-01-31 2:12 ` Michael Wang
2013-01-31 5:16 ` Namhyung Kim
2013-01-31 6:39 ` Michael Wang [this message]
2013-01-31 6:58 ` Namhyung Kim
2013-01-31 7:30 ` Michael Wang
2013-01-31 7:40 ` Namhyung Kim
2013-01-31 8:24 ` Michael Wang
2013-01-31 8:45 ` Michael Wang
2013-01-31 8:57 ` Michael Wang
2013-02-01 8:53 ` Namhyung Kim
2013-02-02 17:50 ` Sebastian Andrzej Siewior
2013-02-04 3:01 ` Michael Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=510A1198.2070803@linux.vnet.ibm.com \
--to=wangyun@linux.vnet.ibm.com \
--cc=bigeasy@linutronix.de \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=rafael.j.wysocki@intel.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).