public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
To: Juri Lelli <juri.lelli@arm.com>
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	viresh.kumar@linaro.org, peterz@infradead.org,
	vincent.guittot@linaro.org
Subject: Re: [PATCH] sched/cpufreq_schedutil: use now as reference when aggregating shared policy requests
Date: Thu, 04 May 2017 16:29:13 +0200	[thread overview]
Message-ID: <1597987.Ujfy2HKTBn@aspire.rjw.lan> (raw)
In-Reply-To: <20170503133048.8742-1-juri.lelli@arm.com>

On Wednesday, May 03, 2017 02:30:48 PM Juri Lelli wrote:
> Currently, sugov_next_freq_shared() uses last_freq_update_time as a
> reference to decide when to start considering CPU contributions as
> stale.
> 
> However, since last_freq_update_time is set by the last CPU that issued
> a frequency transition, this might cause problems in certain cases. In
> practice, the detection of stale utilization values fails whenever the
> CPU with such values was the last to update the policy. For example (and
> please note again that the SCHED_CPUFREQ_RT flag is not the problem
> here, but only the detection of after how much time that flag has to be
> considered stale), suppose a policy with 2 CPUs:
> 
>                CPU0                |               CPU1
>                                    |
>                                    |     RT task scheduled
>                                    |     SCHED_CPUFREQ_RT is set
>                                    |     CPU1->last_update = now
>                                    |     freq transition to max
>                                    |     last_freq_update_time = now
>                                    |
> 
>                         more than TICK_NSEC nsecs
> 
>                                    |
>      a small CFS wakes up          |
>      CPU0->last_update = now1      |
>      delta_ns(CPU0) < TICK_NSEC*   |
>      CPU0's util is considered     |
>      delta_ns(CPU1) =              |
>       last_freq_update_time -      |
>       CPU1->last_update = 0        |
>       < TICK_NSEC                  |
>      CPU1 is still considered      |
>      CPU1->SCHED_CPUFREQ_RT is set |
>      we stay at max (until CPU1    |
>      exits from idle)              |
> 
> * delta_ns is actually negative as now1 > last_freq_update_time
> 
> While last_freq_update_time is a sensible reference for rate limiting,
> it doesn't seem to be useful for working around stale CPU states.
> 
> Fix the problem by always considering now (time) as the reference for
> deciding when CPUs have stale contributions.
> 
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>

OK

I'll queue this up if there are no objections from the people in the CC.

> ---
>  kernel/sched/cpufreq_schedutil.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 76877a62b5fa..622eed1b7658 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -245,11 +245,10 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>  	sugov_update_commit(sg_policy, time, next_f);
>  }
>  
> -static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu)
> +static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)
>  {
>  	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
>  	struct cpufreq_policy *policy = sg_policy->policy;
> -	u64 last_freq_update_time = sg_policy->last_freq_update_time;
>  	unsigned long util = 0, max = 1;
>  	unsigned int j;
>  
> @@ -265,7 +264,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu)
>  		 * enough, don't take the CPU into account as it probably is
>  		 * idle now (and clear iowait_boost for it).
>  		 */
> -		delta_ns = last_freq_update_time - j_sg_cpu->last_update;
> +		delta_ns = time - j_sg_cpu->last_update;
>  		if (delta_ns > TICK_NSEC) {
>  			j_sg_cpu->iowait_boost = 0;
>  			continue;
> @@ -309,7 +308,7 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
>  		if (flags & SCHED_CPUFREQ_RT_DL)
>  			next_f = sg_policy->policy->cpuinfo.max_freq;
>  		else
> -			next_f = sugov_next_freq_shared(sg_cpu);
> +			next_f = sugov_next_freq_shared(sg_cpu, time);
>  
>  		sugov_update_commit(sg_policy, time, next_f);
>  	}
> 

Thanks,
Rafael

  reply	other threads:[~2017-05-04 14:35 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-03 13:30 [PATCH] sched/cpufreq_schedutil: use now as reference when aggregating shared policy requests Juri Lelli
2017-05-04 14:29 ` Rafael J. Wysocki [this message]
2017-05-04 14:40   ` Juri Lelli
2017-05-04 14:41 ` Vincent Guittot
2017-05-05  6:06 ` Viresh Kumar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1597987.Ujfy2HKTBn@aspire.rjw.lan \
    --to=rjw@rjwysocki.net \
    --cc=juri.lelli@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox