From: Brendan Jackman <brendan.jackman@arm.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
Joel Fernandes <joelaf@google.com>,
Andres Oportus <andresoportus@google.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Josef Bacik <josef@toxicpanda.com>,
Morten Rasmussen <morten.rasmussen@arm.com>
Subject: Re: [PATCH v2 5/5] sched/fair: Fix use of find_idlest_group when local group is idlest.
Date: Fri, 25 Aug 2017 16:51:52 +0100 [thread overview]
Message-ID: <87378f1w87.fsf@arm.com> (raw)
In-Reply-To: <CAKfTPtCKAK0mZcSnp4akUaCe9kjsfyqodab1GL0zxbACij-+aQ@mail.gmail.com>
On Fri, Aug 25 2017 at 13:38, Vincent Guittot wrote:
> On 25 August 2017 at 12:16, Brendan Jackman <brendan.jackman@arm.com> wrote:
>> find_idlest_group currently returns NULL when the local group is
>> idlest. The caller then continues the find_idlest_group search at a
>> lower level of the current CPU's sched_domain hierarchy. However
>> find_idlest_group_cpu is not consulted and, crucially, @new_cpu is
>> not updated. This means the search is pointless and we return
>> @prev_cpu from select_task_rq_fair.
>>
>> This patch makes find_idlest_group return the idlest group, and never
>> NULL, and then has the caller unconditionally continue to consult
>> find_idlest_group_cpu and update @new_cpu.
>>
>> * * *
>>
>> An alternative, simpler, fix for this would be to initialize @new_cpu
>> to @cpu instead of @prev_cpu, at the beginning of the new
>> find_idlest_cpu. However this would not fix the fact that
>
> Yes this is simpler
>
>> find_idlest_group_cpu goes unused, and we miss out on the bias toward
>> a shallower-idle CPU, unless find_idlest_group picks a non-local
>> group.
>
> I'm not sure to catch why it's not enough.
> If no cpu of sched_domain span is allowed, we continue to return prev_cpu
> But if at least 1 cpu is allowed, the default new_cpu is cpu in case
> it is really the idlest and no other group will be returned by
> find_idlest_group
> Otherwise, find_idlest_group will finally return another group than
> the local one when running with sd->child
This patch isn't about affinity but just about which group is idlest.
Consider a 4 CPU system with 2 shared caches where CPU0's sched_domains
look like:
DIE: [0 1][2 3]
MC: [0][1]
Suppose CPUs 0 and 1 are equally idle wrt load and util, but CPU1 is in
a shallower idle state. Suppose all CPUs are allowed.
Assuming @cpu is 0, on the first iteration, find_idlest_group (fig) will
currently return NULL meaning [0 1], then on the second iteration it
will return NULL meaning [0].
With this patch, on the first iteration fig will return [0 1], then
find_idlest_group_cpu (figc) will return 1. Then on the second
iteration, fig should return [1] (because it is biased towards the local
group i.e. [@cpu] i.e. [1]). This is better because CPU1 is in a
shallower idle state.
So basically we're currently missing out on the biasing in figc, unless
we at some point return a non-local group.
Does that make sense?
>
> Is there sched_domain topology where the sched_group of lowest level
> is not with only 1 cpu ?
>
>>
>> If that alternative solution was preferred, then some code could be
>> removed in recognition of the fact that when find_idlest_group_cpu
>> was called, @cpu would never be a member of @group (because @group
>> would be a non-local group, and since @sd is derived from @cpu, @cpu
>> would always be in @sd's local group).
>>
>> Signed-off-by: Brendan Jackman <brendan.jackman@arm.com>
>> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>> Cc: Josef Bacik <josef@toxicpanda.com>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Morten Rasmussen <morten.rasmussen@arm.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> ---
>> kernel/sched/fair.c | 29 ++++++++++-------------------
>> 1 file changed, 10 insertions(+), 19 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 26080917ff8d..93e2746d3c30 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -5384,11 +5384,10 @@ static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>> * Assumes p is allowed on at least one CPU in sd.
>> */
>> static struct sched_group *
>> -find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>> - int this_cpu, int sd_flag)
>> +find_idlest_group(struct sched_domain *sd, struct task_struct *p, int sd_flag)
>> {
>> - struct sched_group *idlest = NULL, *group = sd->groups;
>> - struct sched_group *most_spare_sg = NULL;
>> + struct sched_group *group = sd->groups, *local_group = sd->groups;
>> + struct sched_group *idlest = NULL, *most_spare_sg = NULL;
>> unsigned long min_runnable_load = ULONG_MAX;
>> unsigned long this_runnable_load = ULONG_MAX;
>> unsigned long min_avg_load = ULONG_MAX, this_avg_load = ULONG_MAX;
>> @@ -5404,7 +5403,6 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>> do {
>> unsigned long load, avg_load, runnable_load;
>> unsigned long spare_cap, max_spare_cap;
>> - int local_group;
>> int i;
>>
>> /* Skip over this group if it has no CPUs allowed */
>> @@ -5412,9 +5410,6 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>> &p->cpus_allowed))
>> continue;
>>
>> - local_group = cpumask_test_cpu(this_cpu,
>> - sched_group_span(group));
>> -
>> /*
>> * Tally up the load of all CPUs in the group and find
>> * the group containing the CPU with most spare capacity.
>> @@ -5425,7 +5420,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>>
>> for_each_cpu(i, sched_group_span(group)) {
>> /* Bias balancing toward cpus of our domain */
>> - if (local_group)
>> + if (group == local_group)
>> load = source_load(i, load_idx);
>> else
>> load = target_load(i, load_idx);
>> @@ -5446,7 +5441,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>> runnable_load = (runnable_load * SCHED_CAPACITY_SCALE) /
>> group->sgc->capacity;
>>
>> - if (local_group) {
>> + if (group == local_group) {
>> this_runnable_load = runnable_load;
>> this_avg_load = avg_load;
>> this_spare = max_spare_cap;
>> @@ -5492,21 +5487,21 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>>
>> if (this_spare > task_util(p) / 2 &&
>> imbalance_scale*this_spare > 100*most_spare)
>> - return NULL;
>> + return local_group;
>>
>> if (most_spare > task_util(p) / 2)
>> return most_spare_sg;
>>
>> skip_spare:
>> if (!idlest)
>> - return NULL;
>> + return local_group;
>>
>> if (min_runnable_load > (this_runnable_load + imbalance))
>> - return NULL;
>> + return local_group;
>>
>> if ((this_runnable_load < (min_runnable_load + imbalance)) &&
>> (100*this_avg_load < imbalance_scale*min_avg_load))
>> - return NULL;
>> + return local_group;
>>
>> return idlest;
>> }
>> @@ -5582,11 +5577,7 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p
>> continue;
>> }
>>
>> - group = find_idlest_group(sd, p, cpu, sd_flag);
>> - if (!group) {
>> - sd = sd->child;
>> - continue;
>> - }
>> + group = find_idlest_group(sd, p, sd_flag);
>>
>> new_cpu = find_idlest_group_cpu(group, p, cpu);
>> if (new_cpu == cpu) {
>> --
>> 2.14.1
>>
next prev parent reply other threads:[~2017-08-25 15:51 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-25 10:16 [PATCH v2 0/5] sched/fair: Tweaks for select_task_rq_fair slowpath Brendan Jackman
2017-08-25 10:16 ` [PATCH v2 1/5] sched/fair: Move select_task_rq_fair slow-path into its own function Brendan Jackman
2017-08-25 12:22 ` Josef Bacik
2017-08-25 13:40 ` Vincent Guittot
2017-08-25 10:16 ` [PATCH v2 2/5] sched/fair: Remove unnecessary comparison with -1 Brendan Jackman
2017-08-25 12:44 ` Josef Bacik
2017-08-25 13:40 ` Vincent Guittot
2017-08-25 10:16 ` [PATCH v2 3/5] sched/fair: Fix find_idlest_group when local group is not allowed Brendan Jackman
2017-08-25 13:39 ` Vincent Guittot
2017-08-25 10:16 ` [PATCH v2 4/5] sched/fair: Fix use of find_idlest_group when no groups are allowed Brendan Jackman
2017-08-25 13:39 ` Vincent Guittot
2017-08-25 10:16 ` [PATCH v2 5/5] sched/fair: Fix use of find_idlest_group when local group is idlest Brendan Jackman
2017-08-25 13:38 ` Vincent Guittot
2017-08-25 15:51 ` Brendan Jackman [this message]
2017-08-28 8:56 ` Vincent Guittot
2017-08-28 9:11 ` Vincent Guittot
2017-08-29 17:09 ` Brendan Jackman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87378f1w87.fsf@arm.com \
--to=brendan.jackman@arm.com \
--cc=andresoportus@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=joelaf@google.com \
--cc=josef@toxicpanda.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=morten.rasmussen@arm.com \
--cc=peterz@infradead.org \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.