From: kernellwp@gmail.com (Wanpeng Li)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v6 4/6] sched: get CPU's usage statistic
Date: Fri, 21 Nov 2014 13:36:12 +0800
Message-ID: <546ECF4C.2040304@gmail.com>
In-Reply-To: <CAKfTPtCzVRY4q184nqHSL7zFh1CvF009p8q8EPFCZxDFMx9TeQ@mail.gmail.com>
Hi Vincent,
On 9/26/14, 8:17 PM, Vincent Guittot wrote:
> On 25 September 2014 21:05, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>> On 23/09/14 17:08, Vincent Guittot wrote:
>>> Monitor the usage level of each group of each sched_domain level. The usage is
>>> the amount of cpu_capacity that is currently used on a CPU or group of CPUs.
>>> We use the utilization_load_avg to evaluate the usage level of each group.
>>>
>>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>>> ---
>>> kernel/sched/fair.c | 13 +++++++++++++
>>> 1 file changed, 13 insertions(+)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 2cf153d..4097e3f 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -4523,6 +4523,17 @@ static int select_idle_sibling(struct task_struct *p, int target)
>>>  	return target;
>>> }
>>>
>>> +static int get_cpu_usage(int cpu)
>>> +{
>>> +	unsigned long usage = cpu_rq(cpu)->cfs.utilization_load_avg;
>>> +	unsigned long capacity = capacity_orig_of(cpu);
>>> +
>>> +	if (usage >= SCHED_LOAD_SCALE)
>>> +		return capacity + 1;
>> Why are you returning rq->cpu_capacity_orig + 1 (1025) when
>> utilization_load_avg is greater than or equal to 1024, and not usage or
>> (usage * capacity) >> SCHED_LOAD_SHIFT too?
> The usage can't be higher than the full capacity of the CPU because
> it's about the running time on this CPU. Nevertheless, usage can be
> higher than SCHED_LOAD_SCALE because of unfortunate rounding in
> avg_period and running_load_avg or just after migrating tasks until
> the average stabilizes with the new running time.
>
>> In case the weight of a sched group is greater than 1, you might lose
>> the information that the whole sched group is over-utilized too.
> It is exactly for a sched_group with more than 1 CPU that we need to
> cap the usage of a CPU to 100%. Otherwise, the group could be seen as
> overloaded (CPU0 usage at 121% + CPU1 usage at 80%) whereas CPU1 still
> has 20% of available capacity.
>
>> You add up the individual cpu usage values for a group by
>> sgs->group_usage += get_cpu_usage(i) in update_sg_lb_stats and later use
>> sgs->group_usage in group_is_overloaded to compare it against
>> sgs->group_capacity (taking imbalance_pct into consideration).
>>
>>> +
>>> +	return (usage * capacity) >> SCHED_LOAD_SHIFT;
>> Nit-pick: Since you're multiplying by a capacity value
>> (rq->cpu_capacity_orig) you should shift by SCHED_CAPACITY_SHIFT.
> We want to compare the output of the function with some capacity
> figures, so I think that >> SCHED_LOAD_SHIFT is the right operation.
Could you explain in more detail why '>> SCHED_LOAD_SHIFT' is used here
instead of '>> SCHED_CAPACITY_SHIFT'?
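
For reference, here is a minimal user-space sketch of the capping and scaling discussed above. It is not the kernel code: the names cpu_usage_sketch, cpu_usage_uncapped, SKETCH_LOAD_SHIFT and SKETCH_LOAD_SCALE are hypothetical, the shift is assumed to be 10 (scale 1024, consistent with the 1024/1025 figures quoted above), and the return type is unsigned long rather than the int used in the patch.

```c
#include <stdio.h>

/* Assumption: the load shift is 10, i.e. a scale of 1024. */
#define SKETCH_LOAD_SHIFT 10
#define SKETCH_LOAD_SCALE (1UL << SKETCH_LOAD_SHIFT)

/*
 * Stand-in for get_cpu_usage() from the quoted patch: usage at or above
 * the scale is capped to capacity + 1, otherwise it is scaled by the
 * CPU's original capacity.
 */
static unsigned long cpu_usage_sketch(unsigned long usage,
				      unsigned long capacity)
{
	if (usage >= SKETCH_LOAD_SCALE)
		return capacity + 1;

	return (usage * capacity) >> SKETCH_LOAD_SHIFT;
}

/* Same scaling without the cap, shown only for comparison. */
static unsigned long cpu_usage_uncapped(unsigned long usage,
					unsigned long capacity)
{
	return (usage * capacity) >> SKETCH_LOAD_SHIFT;
}

/* Print capped and uncapped group usage for the two-CPU example. */
static void print_group_example(void)
{
	/* CPU0 at roughly 121% and CPU1 at roughly 80% of a 1024 scale. */
	unsigned long usage[2] = { 1239, 819 };
	unsigned long capacity[2] = { 1024, 1024 };
	unsigned long capped = 0, uncapped = 0;
	int i;

	for (i = 0; i < 2; i++) {
		capped += cpu_usage_sketch(usage[i], capacity[i]);
		uncapped += cpu_usage_uncapped(usage[i], capacity[i]);
	}

	printf("capped=%lu uncapped=%lu group_capacity=%lu\n",
	       capped, uncapped, capacity[0] + capacity[1]);
}
```

With these illustrative numbers the uncapped sum exceeds the group capacity of 2048, while the capped sum stays below it, which matches the CPU0 121% + CPU1 80% example given above.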
Regards,
Wanpeng Li
>
>> Just to make sure: You do this scaling of usage by cpu_capacity_orig
>> here only to cater for the fact that cpu_capacity_orig might be uarch
>> scaled (by arch_scale_cpu_capacity, !SMT) in update_cpu_capacity while
> I do this for any system with CPUs that have an original capacity that
> is different from SCHED_CAPACITY_SCALE so it's for both uArch and SMT.
>
>> utilization_load_avg is currently not.
>> We don't even uArch scale on ARM TC2 big.LITTLE platform in mainline
>> today due to the missing clock-frequency property in the device tree.
> Sorry, I don't catch your point.
>
>> I think it's hard for people to grasp that your patch-set takes uArch
>> scaling of capacity into consideration but not frequency scaling of
>> capacity (via arch_scale_freq_capacity, not used at the moment).
>>
>>> +}
>>> +
>>> /*
>>> * select_task_rq_fair: Select target runqueue for the waking task in domains
>>> * that have the 'sd_flag' flag set. In practice, this is SD_BALANCE_WAKE,
>>> @@ -5663,6 +5674,7 @@ struct sg_lb_stats {
>>>  	unsigned long sum_weighted_load; /* Weighted load of group's tasks */
>>>  	unsigned long load_per_task;
>>>  	unsigned long group_capacity;
>>> +	unsigned long group_usage; /* Total usage of the group */
>>>  	unsigned int sum_nr_running; /* Nr tasks running in the group */
>>>  	unsigned int group_capacity_factor;
>>>  	unsigned int idle_cpus;
>>> @@ -6037,6 +6049,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
>>>  			load = source_load(i, load_idx);
>>>
>>>  		sgs->group_load += load;
>>> +		sgs->group_usage += get_cpu_usage(i);
>>>  		sgs->sum_nr_running += rq->cfs.h_nr_running;
>>>
>>>  		if (rq->nr_running > 1)
>>>
>>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/