linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stephan Gerhold <stephan.gerhold@linaro.org>
To: Sultan Alsawaf <sultan@kerneltoast.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	regressions@lists.linux.dev, Johan Hovold <johan@kernel.org>
Subject: Re: [PATCH 1/2] cpufreq: schedutil: Fix superfluous updates caused by need_freq_update
Date: Tue, 8 Apr 2025 18:48:06 +0200	[thread overview]
Message-ID: <Z_VTRspvmOUfrawh@linaro.org> (raw)
In-Reply-To: <Z_U_LN0AtH_n4YtE@sultan-box.localdomain>

Hi Sultan,

On Tue, Apr 08, 2025 at 08:22:20AM -0700, Sultan Alsawaf wrote:
> On Tue, Apr 08, 2025 at 10:59:31AM +0200, Stephan Gerhold wrote:
> > On Wed, Dec 11, 2024 at 05:57:32PM -0800, Sultan Alsawaf wrote:
> > > From: "Sultan Alsawaf (unemployed)" <sultan@kerneltoast.com>
> > > 
> > > A redundant frequency update is only truly needed when there is a policy
> > > limits change with a driver that specifies CPUFREQ_NEED_UPDATE_LIMITS.
> > > 
> > > In spite of that, drivers specifying CPUFREQ_NEED_UPDATE_LIMITS receive a
> > > frequency update _all the time_, not just for a policy limits change,
> > > because need_freq_update is never cleared.
> > > 
> > > Furthermore, ignore_dl_rate_limit()'s usage of need_freq_update also leads
> > > to a redundant frequency update, regardless of whether or not the driver
> > > specifies CPUFREQ_NEED_UPDATE_LIMITS, when the next chosen frequency is the
> > > same as the current one.
> > > 
> > > Fix the superfluous updates by only honoring CPUFREQ_NEED_UPDATE_LIMITS
> > > when there's a policy limits change, and clearing need_freq_update when a
> > > requisite redundant update occurs.
> > > 
> > > This is neatly achieved by moving up the CPUFREQ_NEED_UPDATE_LIMITS test
> > > and instead setting need_freq_update to false in sugov_update_next_freq().
> > > 
> > > Signed-off-by: Sultan Alsawaf (unemployed) <sultan@kerneltoast.com>
> > > ---
> > >  kernel/sched/cpufreq_schedutil.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > > index 28c77904ea74..e51d5ce730be 100644
> > > --- a/kernel/sched/cpufreq_schedutil.c
> > > +++ b/kernel/sched/cpufreq_schedutil.c
> > > @@ -83,7 +83,7 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> > >  
> > >  	if (unlikely(sg_policy->limits_changed)) {
> > >  		sg_policy->limits_changed = false;
> > > -		sg_policy->need_freq_update = true;
> > > +		sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> > >  		return true;
> > >  	}
> > >  
> > > @@ -96,7 +96,7 @@ static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> > >  				   unsigned int next_freq)
> > >  {
> > >  	if (sg_policy->need_freq_update)
> > > -		sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> > > +		sg_policy->need_freq_update = false;
> > >  	else if (sg_policy->next_freq == next_freq)
> > >  		return false;
> > >  
> > 
> > This patch breaks cpufreq throttling (e.g. for thermal cooling) for
> > cpufreq drivers that:
> > 
> >  - Have policy->fast_switch_enabled/fast_switch_possible set, but
> >  - Do not have CPUFREQ_NEED_UPDATE_LIMITS flag set
> > 
> > There are several examples for this in the tree (search for
> > "fast_switch_possible"). Of all those drivers, only intel-pstate and
> > amd-pstate (sometimes) set CPUFREQ_NEED_UPDATE_LIMITS.
> > 
> > I can reliably reproduce this with scmi-cpufreq on a Qualcomm X1E
> > laptop:
> > 
> >  1. I added some low temperature trip points in the device tree,
> >     together with passive cpufreq cooling.
> >  2. I run a CPU stress test on all CPUs and monitor the temperatures
> >     and CPU frequencies.
> > 
> > When using "performance" governor instead of "schedutil", the CPU
> > frequencies are being throttled as expected, as soon as the temperature
> > trip points are reached.
> > 
> > When using "schedutil", the CPU frequencies stay at maximum as long as
> > the stress test is running. No throttling happens, so the device heats
> > up far beyond the defined temperature trip points. Throttling is applied
> > only after stopping the stress test, since this forces schedutil to
> > re-evaluate the CPU frequency.
> > 
> > Reverting this commit fixes the problem.
> > 
> > Looking at the code, I think the problem is that:
> >  - sg_policy->limits_changed does not result in
> >    sg->policy->need_freq_update without CPUFREQ_NEED_UPDATE_LIMITS
> >    anymore, and
> >  - Without sg->policy->need_freq_update, get_next_freq() skips calling
> >    cpufreq_driver_resolve_freq(), which would normally apply the policy
> >    min/max constraints.
> > 
> > Do we need to set CPUFREQ_NEED_UPDATE_LIMITS for all cpufreq drivers
> > that set policy->fast_switch_possible? If I'm reading the documentation
> > comment correctly, that flag is just supposed to enable notifications if
> > the policy min/max changes, but the resolved target frequency is still
> > the same. This is not the case here, the target frequency needs to be
> > throttled, but schedutil isn't applying the new limits.
> > 
> > Any suggestions how to fix this? I'm happy to test patches with my
> > setup.
> 
> Thank you for reporting this. As I see it, sg_policy->need_freq_update is
> working correctly now; however, sg_policy->limits_changed relied on the broken
> behavior of sg_policy->need_freq_update and therefore sg_policy->limits_changed
> needs to be fixed.

Thanks for the quick reply and the patch!

> 
> Can you try this patch:
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 1a19d69b91ed3..f37b999854d52 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -82,7 +82,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>  		return false;
>  
>  	if (unlikely(sg_policy->limits_changed)) {
> -		sg_policy->limits_changed = false;
>  		sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
>  		return true;
>  	}
> @@ -171,9 +170,11 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
>  	freq = get_capacity_ref_freq(policy);
>  	freq = map_util_freq(util, freq, max);
>  
> -	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> +	if (freq == sg_policy->cached_raw_freq && !sg_policy->limits_changed &&
> +	    !sg_policy->need_freq_update)
>  		return sg_policy->next_freq;
>  
> +	sg_policy->limits_changed = false;
>  	sg_policy->cached_raw_freq = freq;
>  	return cpufreq_driver_resolve_freq(policy, freq);
>  }
> 

This is working correctly for me, CPU frequency is being throttled again
when the temperature trip points are reached. If you send this, feel
free to add:

Tested-by: Stephan Gerhold <stephan.gerhold@linaro.org>

Thanks!
Stephan

  reply	other threads:[~2025-04-08 16:48 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-12  1:57 [PATCH 2/2] cpufreq: schedutil: Ignore rate limit when scaling up with FIE present Sultan Alsawaf
2024-12-12  1:57 ` [PATCH 1/2] cpufreq: schedutil: Fix superfluous updates caused by need_freq_update Sultan Alsawaf
2024-12-12 13:24   ` Christian Loehle
2024-12-14  2:35     ` Sultan Alsawaf (unemployed)
2024-12-18 15:10     ` Rafael J. Wysocki
2025-04-08  8:59   ` Stephan Gerhold
2025-04-08 15:22     ` Sultan Alsawaf
2025-04-08 16:48       ` Stephan Gerhold [this message]
2025-04-09 11:25         ` Xuewen Yan
2025-04-09 11:48           ` Xuewen Yan
2025-04-10  1:49             ` Sultan Alsawaf
2025-04-10  2:06               ` Xuewen Yan
2025-04-10  2:08                 ` Sultan Alsawaf
2025-04-10  2:13                   ` Xuewen Yan
2025-04-10  2:22                     ` Sultan Alsawaf
2025-04-10  2:30                       ` Xuewen Yan
2025-04-10  2:33                         ` Sultan Alsawaf
2025-04-10  2:42                           ` Xuewen Yan
2025-04-10  1:52         ` Sultan Alsawaf
2025-04-10 19:16     ` Rafael J. Wysocki
2024-12-12 12:06 ` [PATCH 2/2] cpufreq: schedutil: Ignore rate limit when scaling up with FIE present Christian Loehle
2024-12-14  2:15   ` Sultan Alsawaf (unemployed)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z_VTRspvmOUfrawh@linaro.org \
    --to=stephan.gerhold@linaro.org \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=johan@kernel.org \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=regressions@lists.linux.dev \
    --cc=rostedt@goodmis.org \
    --cc=sultan@kerneltoast.com \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).