From: "zhuguangqing83" <zhuguangqing83@gmail.com>
To: "'Viresh Kumar'" <viresh.kumar@linaro.org>
Cc: <rjw@rjwysocki.net>, <mingo@redhat.com>, <peterz@infradead.org>,
	<juri.lelli@redhat.com>, <vincent.guittot@linaro.org>,
	<dietmar.eggemann@arm.com>, <rostedt@goodmis.org>,
	<bsegall@google.com>, <mgorman@suse.de>, <bristot@redhat.com>,
	<linux-pm@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	"'zhuguangqing'" <zhuguangqing@xiaomi.com>
Subject: Re: [PATCH] cpufreq: schedutil: set sg_policy->next_freq to the final cpufreq
Date: Thu, 29 Oct 2020 09:43:20 +0800
Message-ID: <095901d6ad94$e48d5140$ada7f3c0$@gmail.com>


> On 28-10-20, 19:03, zhuguangqing83 wrote:
> > Thanks for your comments. Maybe my description was not clear before.
> >
> > If I understand correctly, when policy->min/max get changed in the time
> > Window between get_next_freq() and sugov_fast_switch(), to be more
> > precise between cpufreq_driver_resolve_freq() and
> > cpufreq_driver_fast_switch(), the issue may happen.
> >
> > For example, the first time schedutil callback gets called from the
> > scheduler, we reached get_next_freq() and calculate the next_freq,
> > suppose next_freq is 1.0 GHz, then sg_policy->next_freq is updated
> > to 1.0 GHz in sugov_update_next_freq(). If policy->min/max get
> > change right now, suppose policy->min is changed to 1.2 GHz,
> > then the final next_freq is 1.2 GHz for there is another clamp
> > between policy->min and policy->max in cpufreq_driver_fast_switch().
> > Then sg_policy->next_freq(1.0 GHz) is not the final next_freq(1.2 GHz).
> >
> > The second time schedutil callback gets called from the scheduler, there
> > are two issues:
> > (1) Suppose policy->min is still 1.2 GHz, we reached get_next_freq() and
> > calculate the next_freq, because sg_policy->limits_changed gets set to
> > true by sugov_limits() and there is a clamp between policy->min and
> > policy->max, so this time next_freq will be greater than or equal to 1.2
> > GHz, suppose next_freq is also 1.2 GHz. Now next_freq is 1.2 GHz and
> > sg_policy->next_freq is 1.0 GHz,  then we find
> > "if (sg_policy->next_freq == next_freq)" is not satisfied and we call
> > cpufreq driver to change the cpufreq to 1.2 GHz. Actually it's already
> > 1.2 GHz, it's not necessary to change this time.
> 
> This isn't that bad, but ...
> 
> > (2) Suppose policy->min was changed again to 1.0 GHz before, we reached
> > get_next_freq() and calculate the next_freq, suppose next_freq is also
> > 1.0 GHz. Now next_freq is 1.0 GHz and sg_policy->next_freq is also 1.0 GHz,
> > then we find "if (sg_policy->next_freq == next_freq)" is satisfied and we
> > don't change the cpufreq. Actually we should change the cpufreq to 1.0 GHz
> > this time.
> 
> This is a real problem we can get into. What about this diff instead ?
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 0c5c61a095f6..bf7800e853d3 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -105,7 +105,6 @@ static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>         if (sg_policy->next_freq == next_freq)
>                 return false;
> 
> -       sg_policy->next_freq = next_freq;
>         sg_policy->last_freq_update_time = time;
> 
>         return true;

It's a little strange that sg_policy->next_freq and 
sg_policy->last_freq_update_time are not updated at the same time.

> @@ -115,7 +114,7 @@ static void sugov_fast_switch(struct sugov_policy *sg_policy, u64 time,
>                               unsigned int next_freq)
>  {
>         if (sugov_update_next_freq(sg_policy, time, next_freq))
> -               cpufreq_driver_fast_switch(sg_policy->policy, next_freq);
> +               sg_policy->next_freq = cpufreq_driver_fast_switch(sg_policy->policy, next_freq);
>  }
> 

Great, this also handles the case where the driver's ->fast_switch()
callback returns 0 to indicate an error condition.
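
To make the 0-return case concrete, here is a rough userspace sketch, not
kernel code. All names (model_sg_policy, model_fast_switch,
model_fail_count) are made up for illustration, and it assumes the driver
returns 0 on failure and otherwise the frequency it set. Since the
returned value is what gets recorded, a failed switch leaves next_freq at
0, so an identical request later is retried rather than skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in the kernel. */
struct model_sg_policy {
	unsigned int next_freq;    /* frequency the governor believes is set */
	unsigned int driver_calls; /* number of times the driver was invoked */
};

/* Hypothetical knob: how many upcoming driver calls should fail. */
static unsigned int model_fail_count;

/* Stand-in for ->fast_switch(): returns the frequency set, or 0 on error. */
static unsigned int model_fast_switch(unsigned int target_freq)
{
	assert(target_freq != 0); /* 0 is reserved for the error case */

	if (model_fail_count) {
		model_fail_count--;
		return 0;
	}
	return target_freq;
}

/*
 * Mirrors the flow of the proposed diff: skip if the request matches the
 * recorded value, otherwise record whatever the driver returned.
 * Returns true only if the driver reported a successful switch.
 */
static bool model_sugov_fast_switch(struct model_sg_policy *sg,
				    unsigned int next_freq)
{
	if (sg->next_freq == next_freq)
		return false;

	sg->driver_calls++;
	sg->next_freq = model_fast_switch(next_freq);
	return sg->next_freq != 0;
}
```

With model_fail_count set to 1, the first request for 1000000 kHz fails
and records 0, the second identical request reaches the driver again and
succeeds, and only the third one is skipped.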

However, policy->min/max may not be real CPU frequencies from the OPP
table, so the next_freq returned by get_next_freq(), which has been
clamped between policy->min and policy->max, may not be a real OPP
frequency either. In that case, if we store the real OPP frequency
returned by cpufreq_driver_fast_switch() in sg_policy->next_freq and
compare it with next_freq, "if (sg_policy->next_freq == next_freq)" would
never be satisfied, and we would change the CPU frequency every time the
schedutil callback gets called from the scheduler. Looking at the current
code in get_next_freq(), shown below, the issue described above should
not happen. Maybe this is just an unnecessary worry on my part.

static unsigned int get_next_freq(struct sugov_policy *sg_policy,
				  unsigned long util, unsigned long max)
{
......
	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
		return sg_policy->next_freq;
......
}
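
The worry above, and why the cached_raw_freq check avoids it, can be
shown with a small userspace model. This is only a sketch under
assumptions: the OPP table, model_policy, and model_update() are
hypothetical, frequencies are in kHz, and the "driver" is assumed to pick
the lowest OPP at or above the clamped target. The use_cache flag turns
the early return on or off so both behaviours can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical OPP table in kHz; not taken from any real platform. */
static const unsigned int model_opp_khz[] = {
	800000, 1000000, 1200000, 1500000
};

struct model_policy {
	unsigned int min, max;        /* limits; may fall between OPPs */
	unsigned int next_freq;       /* value recorded after the last switch */
	unsigned int cached_raw_freq; /* last raw (unclamped) request */
	bool need_freq_update;
	unsigned int switches;        /* how often the "driver" was called */
};

static unsigned int model_clamp(unsigned int f, unsigned int lo,
				unsigned int hi)
{
	assert(lo <= hi);
	if (f < lo)
		return lo;
	if (f > hi)
		return hi;
	return f;
}

/* Assumed driver behaviour: lowest OPP >= target, else the highest OPP. */
static unsigned int model_resolve_opp(unsigned int target)
{
	size_t n = sizeof(model_opp_khz) / sizeof(model_opp_khz[0]);

	for (size_t i = 0; i < n; i++)
		if (model_opp_khz[i] >= target)
			return model_opp_khz[i];
	return model_opp_khz[n - 1];
}

static void model_update(struct model_policy *p, unsigned int raw_freq,
			 bool use_cache)
{
	unsigned int target;

	/* Early return modelled on the get_next_freq() excerpt above. */
	if (use_cache && raw_freq == p->cached_raw_freq &&
	    !p->need_freq_update) {
		target = p->next_freq;
	} else {
		p->cached_raw_freq = raw_freq;
		target = model_clamp(raw_freq, p->min, p->max);
	}

	if (target == p->next_freq)
		return; /* nothing to do */

	/* Record the frequency the "driver" actually picked. */
	p->next_freq = model_resolve_opp(target);
	p->switches++;
}
```

With min = 1100000 (not an OPP) and a raw request of 900000 repeated
three times, the model switches three times when use_cache is false, but
only once when it is true, because the second and third calls return the
recorded 1200000 directly.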

>  static void sugov_deferred_update(struct sugov_policy *sg_policy, u64 time,
> @@ -124,6 +123,7 @@ static void sugov_deferred_update(struct sugov_policy *sg_policy, u64 time,
>         if (!sugov_update_next_freq(sg_policy, time, next_freq))
>                 return;
> 
> +       sg_policy->next_freq = next_freq;
>         if (!sg_policy->work_in_progress) {
>                 sg_policy->work_in_progress = true;
>                 irq_work_queue(&sg_policy->irq_work);
> 
> --
> viresh
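
For reference, case (2) quoted near the top of this mail can be
reproduced with a short userspace model. Again this is only a sketch:
race_policy and race_fast_switch() are hypothetical names, frequencies
are in kHz, and the request is passed in as a parameter to stand for a
value computed before policy->min changed. The record_final flag selects
between recording the request (the behaviour described in the quoted
scenario) and recording the clamped result (the idea of the diff).

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in the kernel. */
struct race_policy {
	unsigned int min, max;         /* current policy limits */
	unsigned int cached_next_freq; /* what the governor remembers */
	unsigned int hw_freq;          /* what the "hardware" runs at */
};

static unsigned int race_clamp(unsigned int f, unsigned int lo,
			       unsigned int hi)
{
	assert(lo <= hi);
	if (f < lo)
		return lo;
	if (f > hi)
		return hi;
	return f;
}

/*
 * request was computed earlier, possibly under older limits. The clamp
 * here stands for the one done on the driver side just before switching.
 * Returns false when the update is skipped as "unchanged".
 */
static bool race_fast_switch(struct race_policy *p, unsigned int request,
			     bool record_final)
{
	unsigned int final;

	if (p->cached_next_freq == request)
		return false;

	final = race_clamp(request, p->min, p->max);
	p->hw_freq = final;
	p->cached_next_freq = record_final ? final : request;
	return true;
}
```

Starting with min already raised to 1200000 and a stale request of
1000000, the first call sets hw_freq to 1200000. If min then drops back
to 1000000 and 1000000 is requested again, the call is skipped when
record_final is false (hw_freq stays at 1200000), and goes through when
record_final is true (hw_freq becomes 1000000).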


