From: kernellwp@gmail.com (Wanpeng Li)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 01/12] sched: fix imbalance flag reset
Date: Tue, 25 Nov 2014 18:13:42 +0800
Message-ID: <54745656.40707@gmail.com>
In-Reply-To: <CAKfTPtBS6Ri-hNf+M_BGDxhpXEwWP=n0zmTi5-pdxGi0dMF6eQ@mail.gmail.com>

Hi Vincent,
On 11/25/14, 5:04 PM, Vincent Guittot wrote:
> On 25 November 2014 at 00:47, Wanpeng Li <kernellwp@gmail.com> wrote:
>> Hi Vincent,
>> On 7/29/14, 1:51 AM, Vincent Guittot wrote:
>>> The imbalance flag can stay set even though there is no imbalance.
>>>
>>> Let's assume that we have 3 tasks running on a dual-core / dual-cluster
>>> system. Some idle load balances are triggered during the tick.
>>> Unfortunately, the tick is also used to queue background work, so we can
>>> reach a situation where short work has been queued on a CPU that already
>>> runs a task. The load balance will detect this imbalance (2 tasks on 1 CPU
>>> and an idle CPU) and will try to pull the waiting task onto the idle CPU.
>>> The waiting task is a worker thread that is pinned to a CPU, so an
>>> imbalance due to a pinned task is detected and the imbalance flag is set.
>>> Then, we will not be able to clear the flag because we have at most 1 task
>>> on each CPU, but the imbalance flag will trigger useless active load
>>> balances between the idle CPU and the busy CPU.
>>>
>>> We need to reset the imbalance flag as soon as we have reached a balanced
>>> state. If all tasks are pinned, we don't consider that a balanced state
>>> and leave the imbalance flag set.
>>>
>>> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
>>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>>> ---
>>>    kernel/sched/fair.c | 23 +++++++++++++++++++----
>>>    1 file changed, 19 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 923fe32..7eb9126 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -6672,10 +6672,8 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>>>                  if (sd_parent) {
>>>                          int *group_imbalance = &sd_parent->groups->sgc->imbalance;
>>>
>>> -                       if ((env.flags & LBF_SOME_PINNED) && env.imbalance > 0) {
>>> +                       if ((env.flags & LBF_SOME_PINNED) && env.imbalance > 0)
>>>                                  *group_imbalance = 1;
>>> -                       } else if (*group_imbalance)
>>> -                               *group_imbalance = 0;
>>
>> As you mentioned above, "We need to reset the imbalance flag as soon as
>> we have reached a balanced state." I think the code before your patch
>> already does this. What am I missing? Many thanks for your patience. ;-)
> The previous code was called only when busiest->nr_running > 1. The
> background activity will be on the rq for only 1 tick every few seconds,
> and we will set group_imbalance while the background activity is on the rq.
> Then, during the next load balances, group_imbalance is still set, but we
> can't clear group_imbalance because we have only 1 task per rq.

I think there is no load balance then, since busiest->nr_running > 1 is not
true, even if the patch is not applied.

Regards,
Wanpeng Li

>
> Regards,
> Vincent
>
>> Regards,
>> Wanpeng Li
>>
>>
>>>                  }
>>>                  /* All tasks on this runqueue were pinned by CPU affinity */
>>> @@ -6686,7 +6684,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>>>                                  env.loop_break = sched_nr_migrate_break;
>>>                                  goto redo;
>>>                          }
>>> -                       goto out_balanced;
>>> +                       goto out_all_pinned;
>>>                  }
>>>          }
>>>
>>> @@ -6760,6 +6758,23 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>>>          goto out;
>>>
>>>  out_balanced:
>>> +       /*
>>> +        * We reach balance although we may have faced some affinity
>>> +        * constraints. Clear the imbalance flag if it was set.
>>> +        */
>>> +       if (sd_parent) {
>>> +               int *group_imbalance = &sd_parent->groups->sgc->imbalance;
>>> +
>>> +               if (*group_imbalance)
>>> +                       *group_imbalance = 0;
>>> +       }
>>> +
>>> +out_all_pinned:
>>> +       /*
>>> +        * We reach balance because all tasks are pinned at this level so
>>> +        * we can't migrate them. Let the imbalance flag set so parent level
>>> +        * can try to migrate them.
>>> +        */
>>>          schedstat_inc(sd, lb_balanced[idle]);
>>>          sd->nr_balance_failed = 0;
>>
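
To make the two readings above concrete, here is a minimal sketch in C of how the flag might evolve across successive balance passes, with and without the patch. It is a heavily simplified model, not kernel code: struct lb_pass, lb_pass_flag() and replay_scenario() are hypothetical names invented for this illustration, and the model assumes that the passes after the background work has gone exit through out_balanced.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of one load_balance() pass.
 * None of these names exist in the kernel. */
struct lb_pass {
	bool found_busiest;     /* false: the pass exits through out_balanced */
	int busiest_nr_running; /* tasks on the busiest runqueue */
	bool some_pinned;       /* stands in for env.flags & LBF_SOME_PINNED */
	long imbalance;         /* stands in for env.imbalance after the attempt */
};

/* Return the parent group's imbalance flag after one pass. */
static int lb_pass_flag(const struct lb_pass *p, int flag, bool patched)
{
	if (!p->found_busiest)
		/* out_balanced: only the patched code clears the flag here. */
		return patched ? 0 : flag;

	if (p->busiest_nr_running > 1) {
		if (p->some_pinned && p->imbalance > 0)
			flag = 1;
		else if (!patched)
			flag = 0; /* the else-branch that the patch removes */
	}
	/* With at most one task on the busiest runqueue, neither version
	 * touches the flag. */
	return flag;
}

/* Replay the scenario from the commit message and return the final flag. */
static int replay_scenario(bool patched)
{
	const struct lb_pass passes[] = {
		/* Tick 1: a pinned worker is queued behind a running task. */
		{ .found_busiest = true, .busiest_nr_running = 2,
		  .some_pinned = true, .imbalance = 1 },
		/* Later ticks: the worker is gone, at most one task per CPU. */
		{ .found_busiest = false, .busiest_nr_running = 1 },
		{ .found_busiest = false, .busiest_nr_running = 1 },
	};
	int flag = 0;

	for (size_t i = 0; i < sizeof(passes) / sizeof(passes[0]); i++)
		flag = lb_pass_flag(&passes[i], flag, patched);
	return flag;
}
```

Under these assumptions, replay_scenario(false) leaves the flag at 1 and replay_scenario(true) ends with it at 0. If the later passes instead found a busiest group with a single running task, the model leaves the flag untouched in both versions, which is the case raised in the reply above.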

WARNING: multiple messages have this Message-ID (diff)
From: Wanpeng Li <kernellwp@gmail.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Preeti U Murthy <preeti@linux.vnet.ibm.com>,
	Russell King - ARM Linux <linux@arm.linux.org.uk>,
	LAK <linux-arm-kernel@lists.infradead.org>,
	Rik van Riel <riel@redhat.com>,
	Morten Rasmussen <Morten.Rasmussen@arm.com>,
	Mike Galbraith <efault@gmx.de>,
	Nicolas Pitre <nicolas.pitre@linaro.org>,
	"linaro-kernel@lists.linaro.org" <linaro-kernel@lists.linaro.org>,
	Daniel Lezcano <daniel.lezcano@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>
Subject: Re: [PATCH v4 01/12] sched: fix imbalance flag reset
Date: Tue, 25 Nov 2014 18:13:42 +0800
Message-ID: <54745656.40707@gmail.com>
In-Reply-To: <CAKfTPtBS6Ri-hNf+M_BGDxhpXEwWP=n0zmTi5-pdxGi0dMF6eQ@mail.gmail.com>

