Linux Power Management development
 help / color / mirror / Atom feed
From: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "zhenglifeng (A)" <zhenglifeng1@huawei.com>,
	rafael@kernel.org, stratosk@semaphore.gr,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linuxarm@huawei.com, zhanjie9@hisilicon.com,
	lihuisong@huawei.com, yubowen8@huawei.com,
	zhangpengjie2@huawei.com, wangzhi12@huawei.com,
	linhongye@h-partners.com, zhongqiu.han@oss.qualcomm.com
Subject: Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
Date: Mon, 11 May 2026 12:03:46 +0800	[thread overview]
Message-ID: <a028421a-de24-43cd-ac80-17de9b2fc523@oss.qualcomm.com> (raw)
In-Reply-To: <pr25udlbea4vwtx27gfzlnqzv3lmmphkjx5xj2ggzdegmd4ntn@kiytfovoifi3>

On 5/8/2026 6:14 PM, Viresh Kumar wrote:
> On 08-05-26, 14:46, Zhongqiu Han wrote:
>> Nice suggestion! — I think it aligns well with the general direction of
>> addressing this issue and relying more on the cpufreq core.
>>
>> That said, I have a concern if we drop the decrease-path early-exit
>> check while keeping the existing condition:
>>
>>          if (requested_freq > freq_step)
>>
>> unchanged. In that case, when the system is already at policy->min, this
>> appears to introduce a governor-level oscillation, even though the
>> effective hardware frequency remains unchanged:
>>
>> Consider the following scenario
>> (policy->min = 200, freq_step = 100, hardware already at min, low load):
>>
>>    Iteration N:
>>      requested_freq = 200
>>      decrease path: 200 > 100 -> requested_freq = 100
>>      __cpufreq_driver_target(100, LE):
>>          clamp -> 200, target == cur
>>          -> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
>>          -> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
>>      dbs_info->requested_freq = 100 <- below min
>>
>>    Iteration N+1:
>>      requested_freq = 100
>>      out-of-range check: reset to policy->cur = 200
>>      decrease path: 200 > 100 -> requested_freq = 100
>>      dbs_info->requested_freq = 100   <- below min again
>>
>> This results in a repeated oscillation pattern. For drivers with
>> CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
>> the driver may be invoked every sampling period even though the
>> effective frequency does not change. I'm happy to defer to your judgment
>> on whether this is significant enough to warrant further changes.
>>
>> Given this, may i know would it make sense to adjust only the decrease
>> path early-exit condition to use dbs_info->requested_freq (i.e. the last
>> target actually requested from hardware), rather than the idle-adjusted
>> local requested_freq?
>>
>>    -     if (requested_freq == policy->min)
>>    +     if (dbs_info->requested_freq == policy->min)
>>              goto out;
>>
>> With this change, the scenario raised by Lifeng
>> (dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
>> follows:
>>
>>    Iteration 1:
>>      requested_freq = 400
>>      idle decay: requested_freq = 200
>>      if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
>>      200 > 100 -> requested_freq = 100
>>      __cpufreq_driver_target(100, LE):
>>          clamp -> 200, target(200) != cur(400)
>>          -> driver invoked, hardware: 400 -> 200 MHz   [bug fixed]
>>      dbs_info->requested_freq = 100   <- one-time transient value
>>
>>    Iteration 2:
>>      requested_freq = 100
>>      out-of-range check: reset to policy->cur = 200
>>      if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out
>>
>>    -> Stable from iteration 2 onward, with no extra driver calls.
>>
>> There is a one-time transient where dbs_info->requested_freq briefly
>> drops below policy->min, but this is harmless: the existing out-of-range
>> check corrects it in the next iteration. A similar situation was
>> discussed before [1], where the conclusion at the time was that
>> __cpufreq_driver_target() already performs the necessary clamping. If it
>> would be clearer or more robust to add an explicit guard
>> here, this can be revisited as well at your discretion.😊
>>
>> [1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/
>>
>> The increase-path early-exit is intentionally left unchanged, to avoid
>> altering the behavior in the scenario where conservative idle decay
>> would otherwise be ignored when the previous requested frequency was
>> already at policy->max.
> 
> What about this ?
> 
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index df01d33993d8..c7ec11de9a43 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -103,10 +103,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>          if (load > dbs_data->up_threshold) {
>                  dbs_info->down_skip = 0;
> 
> -               /* if we are already at full speed then break out early */
> -               if (requested_freq == policy->max)
> -                       goto out;
> -
>                  requested_freq += freq_step;
>                  if (requested_freq > policy->max)
>                          requested_freq = policy->max;
> @@ -124,15 +120,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> 
>          /* Check for frequency decrease */
>          if (load < cs_tuners->down_threshold) {
> -               /*
> -                * if we cannot reduce the frequency anymore, break out early
> -                */
> -               if (requested_freq == policy->min)
> -                       goto out;
> -
> -               if (requested_freq > freq_step)
> -                       requested_freq -= freq_step;
> -               else
> +               requested_freq -= freq_step;
> +               if (requested_freq < policy->min)
>                          requested_freq = policy->min;
> 
>                  __cpufreq_driver_target(policy, requested_freq,
> 

Just one small concern is that requested_freq may underflow as an
unsigned value; perhaps this could be improved by:

-		if (requested_freq > freq_step)
+		if (requested_freq > policy->min + freq_step)
  			requested_freq -= freq_step;
  		else
  			requested_freq = policy->min;

or use min() func?

Additionally, it seems that dropping the early‑exit checks also appears
to be a nice side effect fix for CPUFREQ_NEED_UPDATE_LIMITS drivers when
updating internal upper and lower frequency boundaries.

As designed in commit 1c534352f47f ("cpufreq: Introduce
CPUFREQ_NEED_UPDATE_LIMITS driver flag"), a cpufreq driver may need to
update its internal frequency limits when policy min or max changes for
drivers setting CPUFREQ_NEED_UPDATE_LIMITS.

However, the early‑exit in cs_dbs_update() can prevent
__cpufreq_driver_target() from ever being called.

For example, when policy->min rises from 200 to 400 kHz while policy
->cur is already at 400 kHz, under low load:

   cpufreq_policy_apply_limits():
     policy->min(400) > policy->cur(400) ?  NO -> driver not called

   cs_limits():
     dbs_info->requested_freq = policy->cur = 400

   [next sampling period, low load]
   cs_dbs_update():
     requested_freq = 400
     if (requested_freq == policy->min)  /* 400 == 400 -> true */
         goto out;  /* __cpufreq_driver_target() never called */

With the early‑exit removed, __cpufreq_driver_target() is called with
target == policy->cur, and CPUFREQ_NEED_UPDATE_LIMITS ensures the driver
is invoked to update its internal performance boundaries.


-- 
Thx and BRs,
Zhongqiu Han

  parent reply	other threads:[~2026-05-11  4:03 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-21 12:35 [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target Lifeng Zheng
2026-04-22  8:06 ` Stratos Karafotis
2026-04-22  8:36   ` zhenglifeng (A)
2026-04-23  5:39 ` Zhongqiu Han
2026-04-23  7:12   ` zhenglifeng (A)
2026-05-07  9:33     ` Viresh Kumar
2026-05-08  6:46       ` Zhongqiu Han
2026-05-08 10:14         ` Viresh Kumar
2026-05-09  2:01           ` zhenglifeng (A)
2026-05-11  4:03           ` Zhongqiu Han [this message]
2026-05-09  1:54       ` zhenglifeng (A)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a028421a-de24-43cd-ac80-17de9b2fc523@oss.qualcomm.com \
    --to=zhongqiu.han@oss.qualcomm.com \
    --cc=lihuisong@huawei.com \
    --cc=linhongye@h-partners.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=rafael@kernel.org \
    --cc=stratosk@semaphore.gr \
    --cc=viresh.kumar@linaro.org \
    --cc=wangzhi12@huawei.com \
    --cc=yubowen8@huawei.com \
    --cc=zhangpengjie2@huawei.com \
    --cc=zhanjie9@hisilicon.com \
    --cc=zhenglifeng1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox