Linux Power Management development
 help / color / mirror / Atom feed
From: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
To: Viresh Kumar <viresh.kumar@linaro.org>,
	"zhenglifeng (A)" <zhenglifeng1@huawei.com>
Cc: rafael@kernel.org, stratosk@semaphore.gr,
	linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linuxarm@huawei.com, zhanjie9@hisilicon.com,
	lihuisong@huawei.com, yubowen8@huawei.com,
	zhangpengjie2@huawei.com, wangzhi12@huawei.com,
	linhongye@h-partners.com, zhongqiu.han@oss.qualcomm.com
Subject: Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
Date: Fri, 8 May 2026 14:46:58 +0800	[thread overview]
Message-ID: <80d32bdb-f661-44cb-b529-3d5ce2142af0@oss.qualcomm.com> (raw)
In-Reply-To: <aoo2k2uzi6zg6owrmi7qgcgjryxzk7l7f2mdo6vtjjdd576hpn@rg6zyqgn7grr>

On 5/7/2026 5:33 PM, Viresh Kumar wrote:
> On 23-04-26, 15:12, zhenglifeng (A) wrote:
>> Yes, I think you are right. The behaviors are not the same. I modified this
>> just in order to keep it consistent with the case exceeding down_threshold.
>> I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
>> Viresh could give us some advice.
> 
>>>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>>>> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>>>>            dbs_info->down_skip = 0;
>>>>              /* if we are already at full speed then break out early */
>>>> -        if (requested_freq == policy->max)
>>>> +        if (dbs_info->requested_freq == policy->max)
>>>>                goto out;
> 
>>>> -        if (requested_freq == policy->min)
>>>> +        if (dbs_info->requested_freq == policy->min)
> 
> What about dropping these `if` blocks completely ? i.e. always call
> __cpufreq_driver_target().
> 
> __cpufreq_driver_target() already have similar checks in place to optimize
> unnecessary freq changes. We don't really need callers to do the same.
> 

Thanks Viresh a lot,

Nice suggestion! — I think it aligns well with the general direction of
addressing this issue and relying more on the cpufreq core.

That said, I have a concern if we drop the decrease-path early-exit
check while keeping the existing condition:

         if (requested_freq > freq_step)

unchanged. In that case, when the system is already at policy->min, this
appears to introduce a governor-level oscillation, even though the
effective hardware frequency remains unchanged:

Consider the following scenario
(policy->min = 200, freq_step = 100, hardware already at min, low load):

   Iteration N:
     requested_freq = 200
     decrease path: 200 > 100 -> requested_freq = 100
     __cpufreq_driver_target(100, LE):
         clamp -> 200, target == cur
         -> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
         -> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
     dbs_info->requested_freq = 100 <- below min

   Iteration N+1:
     requested_freq = 100
     out-of-range check: reset to policy->cur = 200
     decrease path: 200 > 100 -> requested_freq = 100
     dbs_info->requested_freq = 100   <- below min again

This results in a repeated oscillation pattern. For drivers with
CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
the driver may be invoked every sampling period even though the
effective frequency does not change. I'm happy to defer to your judgment
on whether this is significant enough to warrant further changes.

Given this, may i know would it make sense to adjust only the decrease
path early-exit condition to use dbs_info->requested_freq (i.e. the last
target actually requested from hardware), rather than the idle-adjusted
local requested_freq?

   -     if (requested_freq == policy->min)
   +     if (dbs_info->requested_freq == policy->min)
             goto out;

With this change, the scenario raised by Lifeng
(dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
follows:

   Iteration 1:
     requested_freq = 400
     idle decay: requested_freq = 200
     if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
     200 > 100 -> requested_freq = 100
     __cpufreq_driver_target(100, LE):
         clamp -> 200, target(200) != cur(400)
         -> driver invoked, hardware: 400 -> 200 MHz   [bug fixed]
     dbs_info->requested_freq = 100   <- one-time transient value

   Iteration 2:
     requested_freq = 100
     out-of-range check: reset to policy->cur = 200
     if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out

   -> Stable from iteration 2 onward, with no extra driver calls.

There is a one-time transient where dbs_info->requested_freq briefly
drops below policy->min, but this is harmless: the existing out-of-range
check corrects it in the next iteration. A similar situation was
discussed before [1], where the conclusion at the time was that
__cpufreq_driver_target() already performs the necessary clamping. If it
would be clearer or more robust to add an explicit guard
here, this can be revisited as well at your discretion.😊

[1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/

The increase-path early-exit is intentionally left unchanged, to avoid
altering the behavior in the scenario where conservative idle decay
would otherwise be ignored when the previous requested frequency was
already at policy->max.


-- 
Thx and BRs,
Zhongqiu Han

  reply	other threads:[~2026-05-08  6:47 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-21 12:35 [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target Lifeng Zheng
2026-04-22  8:06 ` Stratos Karafotis
2026-04-22  8:36   ` zhenglifeng (A)
2026-04-23  5:39 ` Zhongqiu Han
2026-04-23  7:12   ` zhenglifeng (A)
2026-05-07  9:33     ` Viresh Kumar
2026-05-08  6:46       ` Zhongqiu Han [this message]
2026-05-08 10:14         ` Viresh Kumar
2026-05-09  2:01           ` zhenglifeng (A)
2026-05-11  4:03           ` Zhongqiu Han
2026-05-09  1:54       ` zhenglifeng (A)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=80d32bdb-f661-44cb-b529-3d5ce2142af0@oss.qualcomm.com \
    --to=zhongqiu.han@oss.qualcomm.com \
    --cc=lihuisong@huawei.com \
    --cc=linhongye@h-partners.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=rafael@kernel.org \
    --cc=stratosk@semaphore.gr \
    --cc=viresh.kumar@linaro.org \
    --cc=wangzhi12@huawei.com \
    --cc=yubowen8@huawei.com \
    --cc=zhangpengjie2@huawei.com \
    --cc=zhanjie9@hisilicon.com \
    --cc=zhenglifeng1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox