public inbox for linux-kernel@vger.kernel.org
From: Brendan Jackman <brendan.jackman@arm.com>
To: Joel Fernandes <joelaf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andres Oportus <andresoportus@google.com>,
	Ingo Molnar <mingo@redhat.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: Re: [PATCH 2/2] sched/fair: Fix use of NULL with find_idlest_group
Date: Tue, 22 Aug 2017 11:39:26 +0100
Message-ID: <87efs33mzl.fsf@arm.com>
In-Reply-To: <CAJWu+ooBRXcND26JDvjvR-gGNWSmnV-adnLfy3HAQn23q_xqAg@mail.gmail.com>


On Tue, Aug 22 2017 at 04:34, Joel Fernandes wrote:
> Hi Peter,
>
> On Mon, Aug 21, 2017 at 2:14 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> On Mon, Aug 21, 2017 at 04:21:28PM +0100, Brendan Jackman wrote:
>>> The current use of returning NULL from find_idlest_group is broken in
>>> two cases:
>>>
>>> a1) The local group is not allowed.
>>>
>>>    In this case, we currently do not change this_runnable_load or
>>>    this_avg_load from its initial value of 0, which means we return
>>>    NULL regardless of the load of the other, allowed groups. This
>>>    results in pointlessly continuing the find_idlest_group search
>>>    within the local group and then returning prev_cpu from
>>>    select_task_rq_fair.
>>
>>> b) smp_processor_id() is the "idlest" and != prev_cpu.
>>>
>>>    find_idlest_group also returns NULL when the local group is
>>>    allowed and is the idlest. The caller then continues the
>>>    find_idlest_group search at a lower level of the current CPU's
>>>    sched_domain hierarchy. However new_cpu is not updated. This means
>>>    the search is pointless and we return prev_cpu from
>>>    select_task_rq_fair.
>>>
>>
>> I think its much simpler than that.. but its late, so who knows ;-)
>>
>> Both cases seem predicated on the assumption that we'll return @cpu when
>> we don't find any idler CPU. Consider, if the local group is the idlest,
>> we should stick with @cpu and simply proceed with the child domain.
>>
>> The confusion, and the bugs, seem to have snuck in when we started
>> considering @prev_cpu, whenever that was. The below is mostly code
>> movement to put that whole while(sd) loop into its own function.
>>
>> The effective change is setting @new_cpu = @cpu when we start that loop:
>>
> <snip>
>> ---
>>  kernel/sched/fair.c | 83 +++++++++++++++++++++++++++++++----------------------
>>  1 file changed, 48 insertions(+), 35 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index c77e4b1d51c0..3e77265c480a 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -5588,10 +5588,10 @@ static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>>  }
>>
>>  /*
>> - * find_idlest_cpu - find the idlest cpu among the cpus in group.
>> + * find_idlest_group_cpu - find the idlest cpu among the cpus in group.
>>   */
>>  static int
>> -find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>> +find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>>  {
>>         unsigned long load, min_load = ULONG_MAX;
>>         unsigned int min_exit_latency = UINT_MAX;
>> @@ -5640,6 +5640,50 @@ static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
>>         return shallowest_idle_cpu != -1 ? shallowest_idle_cpu : least_loaded_cpu;
>>  }
>>
>> +static int
>> +find_idlest_cpu(struct sched_domain *sd, struct task_struct *p, int cpu, int sd_flag)
>> +{
>> +       struct sched_domain *tmp;
>> +       int new_cpu = cpu;
>> +
>> +       while (sd) {
>> +               struct sched_group *group;
>> +               int weight;
>> +
>> +               if (!(sd->flags & sd_flag)) {
>> +                       sd = sd->child;
>> +                       continue;
>> +               }
>> +
>> +               group = find_idlest_group(sd, p, cpu, sd_flag);
>> +               if (!group) {
>> +                       sd = sd->child;
>> +                       continue;
>
> But this will still have the issue of pointlessly searching in the
> local_group when the idlest CPU is in the non-local group? Stemming
> from the fact that find_idlest_group is broken if the local group is
> not allowed.
>
> I believe this is fixed by Brendan's patch? :
>
> "Initializing this_runnable_load and this_avg_load to ULONG_MAX
>    instead of 0. This means in case a1) we now return the idlest
>    non-local group."
>

Yeah, I don't think this is enough, firstly because of affinity and
secondly because I _guess_ we don't want to migrate from prev_cpu to
@cpu if none of the sched_domains have sd_flag set (...right? That would
feel like wake balancing even when there's no SD_BALANCE_WAKE anywhere).
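
To make that second concern concrete, here is a toy sketch of the control
flow only. It is not kernel code: toy_domain, TOY_BALANCE_WAKE and
toy_find_idlest_cpu are hypothetical names, and the group search is stubbed
out so that new_cpu is never reassigned inside the loop.

```c
#include <stddef.h>

#define TOY_BALANCE_WAKE 0x1	/* hypothetical flag bit, not the kernel's value */

/* Hypothetical stand-in for struct sched_domain: just flags and a child link. */
struct toy_domain {
	int flags;
	struct toy_domain *child;
};

/*
 * Mirrors only the control flow of the quoted loop. The group search is
 * stubbed out to "local group is idlest", so new_cpu is never reassigned.
 * *searched counts the domains that actually had sd_flag set.
 */
static int toy_find_idlest_cpu(struct toy_domain *sd, int cpu, int sd_flag,
			       int *searched)
{
	int new_cpu = cpu;	/* the effective change in the quoted patch */

	*searched = 0;
	while (sd) {
		if (sd->flags & sd_flag)
			(*searched)++;	/* a real search would happen here */
		sd = sd->child;
	}
	return new_cpu;
}
```

With a two-level hierarchy where no level has the flag set, a call with
cpu = 3 returns 3 and leaves *searched at 0. If prev_cpu was some other
CPU, the task would move to @cpu although no domain asked for wake balancing.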

However the code movement helps - I'll combine it with Vincent's
suggestions and post a v2.
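
For reference, case a1) from the commit message can be modelled roughly as
below. This is a simplified sketch under stated assumptions, not the real
find_idlest_group: toy_group and toy_find_idlest_group are hypothetical,
load is a single number, the imbalance margins are left out, and -1 stands
in for the NULL return.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_group as seen by one waking task. */
struct toy_group {
	int allowed;		/* may the task run on any CPU in this group? */
	int is_local;		/* does the group contain the current CPU? */
	unsigned long load;
};

/*
 * Returns the index of the idlest allowed non-local group, or -1 for
 * "stay in the local group" (NULL in the kernel function).
 * this_load_init is the initial value of the local load: 0 models the
 * code before the patch, ULONG_MAX models the proposed fix.
 */
static int toy_find_idlest_group(const struct toy_group *g, size_t n,
				 unsigned long this_load_init)
{
	unsigned long this_load = this_load_init;
	unsigned long min_load = ULONG_MAX;
	int idlest = -1;

	for (size_t i = 0; i < n; i++) {
		if (!g[i].allowed)
			continue;	/* a skipped local group never sets this_load */
		if (g[i].is_local) {
			this_load = g[i].load;
		} else if (g[i].load < min_load) {
			min_load = g[i].load;
			idlest = (int)i;
		}
	}

	/* Prefer the local group unless another group is strictly idler. */
	if (idlest == -1 || this_load <= min_load)
		return -1;
	return idlest;
}
```

With a disallowed local group and two allowed remote groups, an initial
value of 0 always makes the local group look idlest, so the function reports
"stay local". Starting from ULONG_MAX lets the idlest allowed group win.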

