Linux Power Management development
 help / color / mirror / Atom feed
* [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
@ 2026-04-21 12:35 Lifeng Zheng
  2026-04-22  8:06 ` Stratos Karafotis
  2026-04-23  5:39 ` Zhongqiu Han
  0 siblings, 2 replies; 11+ messages in thread
From: Lifeng Zheng @ 2026-04-21 12:35 UTC (permalink / raw)
  To: rafael, viresh.kumar, stratosk
  Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
	zhangpengjie2, wangzhi12, linhongye, zhenglifeng1

In cs_dbs_update(), the requested frequency is decremented by one freq_step
for each idle period. However, this can cause divergence between
'requested_freq' (target for current update) and 'dbs_info->requested_freq'
(target from previous update).

When the load crosses up_threshold or down_threshold, the decision on
whether to increase or decrease frequency should be based on the *previous*
target (dbs_info->requested_freq), not the current one. Otherwise, the
update step may be skipped entirely if the current target has already hit a
boundary due to prior adjustments.

Ensure that frequency scaling decisions are made using the correct
historical target, fixing cases where frequency fails to decrease despite
sustained idle periods.

Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
---
 drivers/cpufreq/cpufreq_conservative.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index df01d33993d8..f3c3b54e4bf8 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
 		dbs_info->down_skip = 0;
 
 		/* if we are already at full speed then break out early */
-		if (requested_freq == policy->max)
+		if (dbs_info->requested_freq == policy->max)
 			goto out;
 
 		requested_freq += freq_step;
@@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
 		/*
 		 * if we cannot reduce the frequency anymore, break out early
 		 */
-		if (requested_freq == policy->min)
+		if (dbs_info->requested_freq == policy->min)
 			goto out;
 
 		if (requested_freq > freq_step)
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-04-21 12:35 [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target Lifeng Zheng
@ 2026-04-22  8:06 ` Stratos Karafotis
  2026-04-22  8:36   ` zhenglifeng (A)
  2026-04-23  5:39 ` Zhongqiu Han
  1 sibling, 1 reply; 11+ messages in thread
From: Stratos Karafotis @ 2026-04-22  8:06 UTC (permalink / raw)
  To: Lifeng Zheng, rafael, viresh.kumar
  Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
	zhangpengjie2, wangzhi12, linhongye

Hi all!

I was struggling to see your point, but I think you are right.

The requested_freq could be equal to policy->min (due to idle periods)
while the dbs_info->requested_freq could be greater than policy->min.
So, in this case it breaks out early without the chance to further 
reduce the frequency, correct?

Stratos Karafotis

On 4/21/26 15:35, Lifeng Zheng wrote:
> In cs_dbs_update(), the requested frequency is decremented by one freq_step
> for each idle period. However, this can cause divergence between
> 'requested_freq' (target for current update) and 'dbs_info->requested_freq'
> (target from previous update).
> 
> When the load crosses up_threshold or down_threshold, the decision on
> whether to increase or decrease frequency should be based on the *previous*
> target (dbs_info->requested_freq), not the current one. Otherwise, the
> update step may be skipped entirely if the current target has already hit a
> boundary due to prior adjustments.
> 
> Ensure that frequency scaling decisions are made using the correct
> historical target, fixing cases where frequency fails to decrease despite
> sustained idle periods.
> 
> Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
> Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
> ---
>   drivers/cpufreq/cpufreq_conservative.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index df01d33993d8..f3c3b54e4bf8 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>   		dbs_info->down_skip = 0;
>   
>   		/* if we are already at full speed then break out early */
> -		if (requested_freq == policy->max)
> +		if (dbs_info->requested_freq == policy->max)
>   			goto out;
>   
>   		requested_freq += freq_step;
> @@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>   		/*
>   		 * if we cannot reduce the frequency anymore, break out early
>   		 */
> -		if (requested_freq == policy->min)
> +		if (dbs_info->requested_freq == policy->min)
>   			goto out;
>   
>   		if (requested_freq > freq_step)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-04-22  8:06 ` Stratos Karafotis
@ 2026-04-22  8:36   ` zhenglifeng (A)
  0 siblings, 0 replies; 11+ messages in thread
From: zhenglifeng (A) @ 2026-04-22  8:36 UTC (permalink / raw)
  To: Stratos Karafotis, rafael, viresh.kumar
  Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
	zhangpengjie2, wangzhi12, linhongye

On 4/22/2026 4:06 PM, Stratos Karafotis wrote:
> Hi all!
> 
> I was struggling to see your point, but I think you are right.
> 
> The requested_freq could be equal to policy->min (due to idle periods)
> while the dbs_info->requested_freq could be greater than policy->min.
> So, in this case it breaks out early without the chance to further reduce the frequency, correct?

Yes. This is the case where problems arise.

> 
> Stratos Karafotis
> 
> On 4/21/26 15:35, Lifeng Zheng wrote:
>> In cs_dbs_update(), the requested frequency is decremented by one freq_step
>> for each idle period. However, this can cause divergence between
>> 'requested_freq' (target for current update) and 'dbs_info->requested_freq'
>> (target from previous update).
>>
>> When the load crosses up_threshold or down_threshold, the decision on
>> whether to increase or decrease frequency should be based on the *previous*
>> target (dbs_info->requested_freq), not the current one. Otherwise, the
>> update step may be skipped entirely if the current target has already hit a
>> boundary due to prior adjustments.
>>
>> Ensure that frequency scaling decisions are made using the correct
>> historical target, fixing cases where frequency fails to decrease despite
>> sustained idle periods.
>>
>> Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
>> Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
>> ---
>>   drivers/cpufreq/cpufreq_conservative.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>> index df01d33993d8..f3c3b54e4bf8 100644
>> --- a/drivers/cpufreq/cpufreq_conservative.c
>> +++ b/drivers/cpufreq/cpufreq_conservative.c
>> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>>           dbs_info->down_skip = 0;
>>             /* if we are already at full speed then break out early */
>> -        if (requested_freq == policy->max)
>> +        if (dbs_info->requested_freq == policy->max)
>>               goto out;
>>             requested_freq += freq_step;
>> @@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>>           /*
>>            * if we cannot reduce the frequency anymore, break out early
>>            */
>> -        if (requested_freq == policy->min)
>> +        if (dbs_info->requested_freq == policy->min)
>>               goto out;
>>             if (requested_freq > freq_step)
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-04-21 12:35 [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target Lifeng Zheng
  2026-04-22  8:06 ` Stratos Karafotis
@ 2026-04-23  5:39 ` Zhongqiu Han
  2026-04-23  7:12   ` zhenglifeng (A)
  1 sibling, 1 reply; 11+ messages in thread
From: Zhongqiu Han @ 2026-04-23  5:39 UTC (permalink / raw)
  To: Lifeng Zheng, rafael, viresh.kumar, stratosk
  Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
	zhangpengjie2, wangzhi12, linhongye, zhongqiu.han

On 4/21/2026 8:35 PM, Lifeng Zheng wrote:
> In cs_dbs_update(), the requested frequency is decremented by one freq_step
> for each idle period. However, this can cause divergence between
> 'requested_freq' (target for current update) and 'dbs_info->requested_freq'
> (target from previous update).
> 
> When the load crosses up_threshold or down_threshold, the decision on
> whether to increase or decrease frequency should be based on the *previous*
> target (dbs_info->requested_freq), not the current one. Otherwise, the
> update step may be skipped entirely if the current target has already hit a
> boundary due to prior adjustments.
> 
> Ensure that frequency scaling decisions are made using the correct
> historical target, fixing cases where frequency fails to decrease despite
> sustained idle periods.
> 
> Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
> Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>

Hi Lifeng,
Thanks for the patch.

May I know would this ignore conservative idle decay when the previous
requested frequency was policy->max?


Scenario: Increase path, previous target at max, with idle
compensation; the original code does not have the same behavior as the
current patch.

Initial state:
   policy->max               = 2000 MHz
   policy->min               = 200 MHz
   dbs_info->requested_freq  = 2000 MHz  (= policy->max)
   hardware frequency        = 2000 MHz
   idle_periods              = 2
   load                      = 90% (> up_threshold=80)

1.Original code
Step 1: requested_freq = dbs_info->requested_freq = 2000

Step 2: [idle_periods block]
         freq_steps = 2 * 100 = 200
         2000 > (200 + 200) = 400 ?  YES
         requested_freq = 2000 - 200 = 1800
Step 3: [increase path]
         if (requested_freq == policy->max)
           -> 1800 == 2000 ?  NO  -> fall through

Step 4: requested_freq += freq_step
         requested_freq = 1800 + 100 = 1900

Step 5: __cpufreq_driver_target(policy, 1900, HE) -> hardware = 1900 MHz
Step 6: dbs_info->requested_freq = 1900

Result: hardware 2000 -> *1900 MHz* (net 1-step decrease)


2.Current Patch
Step 1: requested_freq = 2000
Step 2: [idle_periods block] -> requested_freq = 1800
Step 3: if (dbs_info->requested_freq == policy->max)
           -> 2000 == 2000 ?  YES  -> goto out
Step 4: hardware stays at 2000 MHz, dbs_info->requested_freq stays at 2000

Result: hardware stays at *2000 MHz* (no change)




> ---
>   drivers/cpufreq/cpufreq_conservative.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index df01d33993d8..f3c3b54e4bf8 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>   		dbs_info->down_skip = 0;
>   
>   		/* if we are already at full speed then break out early */
> -		if (requested_freq == policy->max)
> +		if (dbs_info->requested_freq == policy->max)
>   			goto out;
>   
>   		requested_freq += freq_step;
> @@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>   		/*
>   		 * if we cannot reduce the frequency anymore, break out early
>   		 */
> -		if (requested_freq == policy->min)
> +		if (dbs_info->requested_freq == policy->min)
>   			goto out;
>   
>   		if (requested_freq > freq_step)


-- 
Thx and BRs,
Zhongqiu Han

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-04-23  5:39 ` Zhongqiu Han
@ 2026-04-23  7:12   ` zhenglifeng (A)
  2026-05-07  9:33     ` Viresh Kumar
  0 siblings, 1 reply; 11+ messages in thread
From: zhenglifeng (A) @ 2026-04-23  7:12 UTC (permalink / raw)
  To: Zhongqiu Han, rafael, viresh.kumar, stratosk
  Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
	zhangpengjie2, wangzhi12, linhongye

On 4/23/2026 1:39 PM, Zhongqiu Han wrote:
> On 4/21/2026 8:35 PM, Lifeng Zheng wrote:
>> In cs_dbs_update(), the requested frequency is decremented by one freq_step
>> for each idle period. However, this can cause divergence between
>> 'requested_freq' (target for current update) and 'dbs_info->requested_freq'
>> (target from previous update).
>>
>> When the load crosses up_threshold or down_threshold, the decision on
>> whether to increase or decrease frequency should be based on the *previous*
>> target (dbs_info->requested_freq), not the current one. Otherwise, the
>> update step may be skipped entirely if the current target has already hit a
>> boundary due to prior adjustments.
>>
>> Ensure that frequency scaling decisions are made using the correct
>> historical target, fixing cases where frequency fails to decrease despite
>> sustained idle periods.
>>
>> Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
>> Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
> 
> Hi Lifeng,
> Thanks for the patch.
> 
> May I know would this ignore conservative idle decay when the previous
> requested frequency was policy->max?
> 
> 
> Scenario: Increase path, previous target at max, with idle
> compensation; the original code does not have the same behavior as the
> current patch.
> 
> Initial state:
>   policy->max               = 2000 MHz
>   policy->min               = 200 MHz
>   dbs_info->requested_freq  = 2000 MHz  (= policy->max)
>   hardware frequency        = 2000 MHz
>   idle_periods              = 2
>   load                      = 90% (> up_threshold=80)
> 
> 1.Original code
> Step 1: requested_freq = dbs_info->requested_freq = 2000
> 
> Step 2: [idle_periods block]
>         freq_steps = 2 * 100 = 200
>         2000 > (200 + 200) = 400 ?  YES
>         requested_freq = 2000 - 200 = 1800
> Step 3: [increase path]
>         if (requested_freq == policy->max)
>           -> 1800 == 2000 ?  NO  -> fall through
> 
> Step 4: requested_freq += freq_step
>         requested_freq = 1800 + 100 = 1900
> 
> Step 5: __cpufreq_driver_target(policy, 1900, HE) -> hardware = 1900 MHz
> Step 6: dbs_info->requested_freq = 1900
> 
> Result: hardware 2000 -> *1900 MHz* (net 1-step decrease)
> 
> 
> 2.Current Patch
> Step 1: requested_freq = 2000
> Step 2: [idle_periods block] -> requested_freq = 1800
> Step 3: if (dbs_info->requested_freq == policy->max)
>           -> 2000 == 2000 ?  YES  -> goto out
> Step 4: hardware stays at 2000 MHz, dbs_info->requested_freq stays at 2000
> 
> Result: hardware stays at *2000 MHz* (no change)
> 
> 

Yes, I think you are right. The behaviors are not the same. I modified this
just in order to keep it consistent with the case exceeding down_threshold.
I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
Viresh could give us some advice.

> 
> 
>> ---
>>   drivers/cpufreq/cpufreq_conservative.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>> index df01d33993d8..f3c3b54e4bf8 100644
>> --- a/drivers/cpufreq/cpufreq_conservative.c
>> +++ b/drivers/cpufreq/cpufreq_conservative.c
>> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>>           dbs_info->down_skip = 0;
>>             /* if we are already at full speed then break out early */
>> -        if (requested_freq == policy->max)
>> +        if (dbs_info->requested_freq == policy->max)
>>               goto out;
>>             requested_freq += freq_step;
>> @@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>>           /*
>>            * if we cannot reduce the frequency anymore, break out early
>>            */
>> -        if (requested_freq == policy->min)
>> +        if (dbs_info->requested_freq == policy->min)
>>               goto out;
>>             if (requested_freq > freq_step)
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-04-23  7:12   ` zhenglifeng (A)
@ 2026-05-07  9:33     ` Viresh Kumar
  2026-05-08  6:46       ` Zhongqiu Han
  2026-05-09  1:54       ` zhenglifeng (A)
  0 siblings, 2 replies; 11+ messages in thread
From: Viresh Kumar @ 2026-05-07  9:33 UTC (permalink / raw)
  To: zhenglifeng (A)
  Cc: Zhongqiu Han, rafael, stratosk, linux-pm, linux-kernel, linuxarm,
	zhanjie9, lihuisong, yubowen8, zhangpengjie2, wangzhi12,
	linhongye

On 23-04-26, 15:12, zhenglifeng (A) wrote:
> Yes, I think you are right. The behaviors are not the same. I modified this
> just in order to keep it consistent with the case exceeding down_threshold.
> I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
> Viresh could give us some advice.

> >> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> >> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> >>           dbs_info->down_skip = 0;
> >>             /* if we are already at full speed then break out early */
> >> -        if (requested_freq == policy->max)
> >> +        if (dbs_info->requested_freq == policy->max)
> >>               goto out;

> >> -        if (requested_freq == policy->min)
> >> +        if (dbs_info->requested_freq == policy->min)

What about dropping these `if` blocks completely ? i.e. always call
__cpufreq_driver_target().

__cpufreq_driver_target() already have similar checks in place to optimize
unnecessary freq changes. We don't really need callers to do the same.

-- 
viresh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-05-07  9:33     ` Viresh Kumar
@ 2026-05-08  6:46       ` Zhongqiu Han
  2026-05-08 10:14         ` Viresh Kumar
  2026-05-09  1:54       ` zhenglifeng (A)
  1 sibling, 1 reply; 11+ messages in thread
From: Zhongqiu Han @ 2026-05-08  6:46 UTC (permalink / raw)
  To: Viresh Kumar, zhenglifeng (A)
  Cc: rafael, stratosk, linux-pm, linux-kernel, linuxarm, zhanjie9,
	lihuisong, yubowen8, zhangpengjie2, wangzhi12, linhongye,
	zhongqiu.han

On 5/7/2026 5:33 PM, Viresh Kumar wrote:
> On 23-04-26, 15:12, zhenglifeng (A) wrote:
>> Yes, I think you are right. The behaviors are not the same. I modified this
>> just in order to keep it consistent with the case exceeding down_threshold.
>> I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
>> Viresh could give us some advice.
> 
>>>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>>>> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>>>>            dbs_info->down_skip = 0;
>>>>              /* if we are already at full speed then break out early */
>>>> -        if (requested_freq == policy->max)
>>>> +        if (dbs_info->requested_freq == policy->max)
>>>>                goto out;
> 
>>>> -        if (requested_freq == policy->min)
>>>> +        if (dbs_info->requested_freq == policy->min)
> 
> What about dropping these `if` blocks completely ? i.e. always call
> __cpufreq_driver_target().
> 
> __cpufreq_driver_target() already have similar checks in place to optimize
> unnecessary freq changes. We don't really need callers to do the same.
> 

Thanks Viresh a lot,

Nice suggestion! — I think it aligns well with the general direction of
addressing this issue and relying more on the cpufreq core.

That said, I have a concern if we drop the decrease-path early-exit
check while keeping the existing condition:

         if (requested_freq > freq_step)

unchanged. In that case, when the system is already at policy->min, this
appears to introduce a governor-level oscillation, even though the
effective hardware frequency remains unchanged:

Consider the following scenario
(policy->min = 200, freq_step = 100, hardware already at min, low load):

   Iteration N:
     requested_freq = 200
     decrease path: 200 > 100 -> requested_freq = 100
     __cpufreq_driver_target(100, LE):
         clamp -> 200, target == cur
         -> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
         -> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
     dbs_info->requested_freq = 100 <- below min

   Iteration N+1:
     requested_freq = 100
     out-of-range check: reset to policy->cur = 200
     decrease path: 200 > 100 -> requested_freq = 100
     dbs_info->requested_freq = 100   <- below min again

This results in a repeated oscillation pattern. For drivers with
CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
the driver may be invoked every sampling period even though the
effective frequency does not change. I'm happy to defer to your judgment
on whether this is significant enough to warrant further changes.

Given this, may i know would it make sense to adjust only the decrease
path early-exit condition to use dbs_info->requested_freq (i.e. the last
target actually requested from hardware), rather than the idle-adjusted
local requested_freq?

   -     if (requested_freq == policy->min)
   +     if (dbs_info->requested_freq == policy->min)
             goto out;

With this change, the scenario raised by Lifeng
(dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
follows:

   Iteration 1:
     requested_freq = 400
     idle decay: requested_freq = 200
     if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
     200 > 100 -> requested_freq = 100
     __cpufreq_driver_target(100, LE):
         clamp -> 200, target(200) != cur(400)
         -> driver invoked, hardware: 400 -> 200 MHz   [bug fixed]
     dbs_info->requested_freq = 100   <- one-time transient value

   Iteration 2:
     requested_freq = 100
     out-of-range check: reset to policy->cur = 200
     if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out

   -> Stable from iteration 2 onward, with no extra driver calls.

There is a one-time transient where dbs_info->requested_freq briefly
drops below policy->min, but this is harmless: the existing out-of-range
check corrects it in the next iteration. A similar situation was
discussed before [1], where the conclusion at the time was that
__cpufreq_driver_target() already performs the necessary clamping. If it
would be clearer or more robust to add an explicit guard
here, this can be revisited as well at your discretion.😊

[1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/

The increase-path early-exit is intentionally left unchanged, to avoid
altering the behavior in the scenario where conservative idle decay
would otherwise be ignored when the previous requested frequency was
already at policy->max.


-- 
Thx and BRs,
Zhongqiu Han

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-05-08  6:46       ` Zhongqiu Han
@ 2026-05-08 10:14         ` Viresh Kumar
  2026-05-09  2:01           ` zhenglifeng (A)
  2026-05-11  4:03           ` Zhongqiu Han
  0 siblings, 2 replies; 11+ messages in thread
From: Viresh Kumar @ 2026-05-08 10:14 UTC (permalink / raw)
  To: Zhongqiu Han
  Cc: zhenglifeng (A), rafael, stratosk, linux-pm, linux-kernel,
	linuxarm, zhanjie9, lihuisong, yubowen8, zhangpengjie2, wangzhi12,
	linhongye

On 08-05-26, 14:46, Zhongqiu Han wrote:
> Nice suggestion! — I think it aligns well with the general direction of
> addressing this issue and relying more on the cpufreq core.
> 
> That said, I have a concern if we drop the decrease-path early-exit
> check while keeping the existing condition:
> 
>         if (requested_freq > freq_step)
> 
> unchanged. In that case, when the system is already at policy->min, this
> appears to introduce a governor-level oscillation, even though the
> effective hardware frequency remains unchanged:
> 
> Consider the following scenario
> (policy->min = 200, freq_step = 100, hardware already at min, low load):
> 
>   Iteration N:
>     requested_freq = 200
>     decrease path: 200 > 100 -> requested_freq = 100
>     __cpufreq_driver_target(100, LE):
>         clamp -> 200, target == cur
>         -> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
>         -> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
>     dbs_info->requested_freq = 100 <- below min
> 
>   Iteration N+1:
>     requested_freq = 100
>     out-of-range check: reset to policy->cur = 200
>     decrease path: 200 > 100 -> requested_freq = 100
>     dbs_info->requested_freq = 100   <- below min again
> 
> This results in a repeated oscillation pattern. For drivers with
> CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
> the driver may be invoked every sampling period even though the
> effective frequency does not change. I'm happy to defer to your judgment
> on whether this is significant enough to warrant further changes.
> 
> Given this, may i know would it make sense to adjust only the decrease
> path early-exit condition to use dbs_info->requested_freq (i.e. the last
> target actually requested from hardware), rather than the idle-adjusted
> local requested_freq?
> 
>   -     if (requested_freq == policy->min)
>   +     if (dbs_info->requested_freq == policy->min)
>             goto out;
> 
> With this change, the scenario raised by Lifeng
> (dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
> follows:
> 
>   Iteration 1:
>     requested_freq = 400
>     idle decay: requested_freq = 200
>     if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
>     200 > 100 -> requested_freq = 100
>     __cpufreq_driver_target(100, LE):
>         clamp -> 200, target(200) != cur(400)
>         -> driver invoked, hardware: 400 -> 200 MHz   [bug fixed]
>     dbs_info->requested_freq = 100   <- one-time transient value
> 
>   Iteration 2:
>     requested_freq = 100
>     out-of-range check: reset to policy->cur = 200
>     if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out
> 
>   -> Stable from iteration 2 onward, with no extra driver calls.
> 
> There is a one-time transient where dbs_info->requested_freq briefly
> drops below policy->min, but this is harmless: the existing out-of-range
> check corrects it in the next iteration. A similar situation was
> discussed before [1], where the conclusion at the time was that
> __cpufreq_driver_target() already performs the necessary clamping. If it
> would be clearer or more robust to add an explicit guard
> here, this can be revisited as well at your discretion.😊
> 
> [1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/
> 
> The increase-path early-exit is intentionally left unchanged, to avoid
> altering the behavior in the scenario where conservative idle decay
> would otherwise be ignored when the previous requested frequency was
> already at policy->max.

What about this ?

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index df01d33993d8..c7ec11de9a43 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -103,10 +103,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
        if (load > dbs_data->up_threshold) {
                dbs_info->down_skip = 0;

-               /* if we are already at full speed then break out early */
-               if (requested_freq == policy->max)
-                       goto out;
-
                requested_freq += freq_step;
                if (requested_freq > policy->max)
                        requested_freq = policy->max;
@@ -124,15 +120,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)

        /* Check for frequency decrease */
        if (load < cs_tuners->down_threshold) {
-               /*
-                * if we cannot reduce the frequency anymore, break out early
-                */
-               if (requested_freq == policy->min)
-                       goto out;
-
-               if (requested_freq > freq_step)
-                       requested_freq -= freq_step;
-               else
+               requested_freq -= freq_step;
+               if (requested_freq < policy->min)
                        requested_freq = policy->min;

                __cpufreq_driver_target(policy, requested_freq,

-- 
viresh

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-05-07  9:33     ` Viresh Kumar
  2026-05-08  6:46       ` Zhongqiu Han
@ 2026-05-09  1:54       ` zhenglifeng (A)
  1 sibling, 0 replies; 11+ messages in thread
From: zhenglifeng (A) @ 2026-05-09  1:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Zhongqiu Han, rafael, stratosk, linux-pm, linux-kernel, linuxarm,
	zhanjie9, lihuisong, yubowen8, zhangpengjie2, wangzhi12,
	linhongye

On 5/7/2026 5:33 PM, Viresh Kumar wrote:
> On 23-04-26, 15:12, zhenglifeng (A) wrote:
>> Yes, I think you are right. The behaviors are not the same. I modified this
>> just in order to keep it consistent with the case exceeding down_threshold.
>> I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
>> Viresh could give us some advice.
> 
>>>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>>>> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>>>>           dbs_info->down_skip = 0;
>>>>             /* if we are already at full speed then break out early */
>>>> -        if (requested_freq == policy->max)
>>>> +        if (dbs_info->requested_freq == policy->max)
>>>>               goto out;
> 
>>>> -        if (requested_freq == policy->min)
>>>> +        if (dbs_info->requested_freq == policy->min)
> 
> What about dropping these `if` blocks completely ? i.e. always call
> __cpufreq_driver_target().
> 
> __cpufreq_driver_target() already have similar checks in place to optimize
> unnecessary freq changes. We don't really need callers to do the same.
> 

__cpufreq_driver_target() only skips this when !(cpufreq_driver->flags & CPUFREQ_NEED_UPDATE_LIMITS),
this means that for cppc_cpufreq, amd-pstate and intel_pstate the
unnecessary freq change will still be called.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-05-08 10:14         ` Viresh Kumar
@ 2026-05-09  2:01           ` zhenglifeng (A)
  2026-05-11  4:03           ` Zhongqiu Han
  1 sibling, 0 replies; 11+ messages in thread
From: zhenglifeng (A) @ 2026-05-09  2:01 UTC (permalink / raw)
  To: Viresh Kumar, Zhongqiu Han
  Cc: rafael, stratosk, linux-pm, linux-kernel, linuxarm, zhanjie9,
	lihuisong, yubowen8, zhangpengjie2, wangzhi12, linhongye

On 5/8/2026 6:14 PM, Viresh Kumar wrote:
> 
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index df01d33993d8..c7ec11de9a43 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -103,10 +103,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>         if (load > dbs_data->up_threshold) {
>                 dbs_info->down_skip = 0;
> 
> -               /* if we are already at full speed then break out early */
> -               if (requested_freq == policy->max)
> -                       goto out;
> -
>                 requested_freq += freq_step;
>                 if (requested_freq > policy->max)
>                         requested_freq = policy->max;
> @@ -124,15 +120,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> 
>         /* Check for frequency decrease */
>         if (load < cs_tuners->down_threshold) {
> -               /*
> -                * if we cannot reduce the frequency anymore, break out early
> -                */
> -               if (requested_freq == policy->min)
> -                       goto out;
> -
> -               if (requested_freq > freq_step)
> -                       requested_freq -= freq_step;
> -               else
> +               requested_freq -= freq_step;

Small probability but it is still possible that freq_step is larger than
requested_freq.

> +               if (requested_freq < policy->min)
>                         requested_freq = policy->min;
> 
>                 __cpufreq_driver_target(policy, requested_freq,
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
  2026-05-08 10:14         ` Viresh Kumar
  2026-05-09  2:01           ` zhenglifeng (A)
@ 2026-05-11  4:03           ` Zhongqiu Han
  1 sibling, 0 replies; 11+ messages in thread
From: Zhongqiu Han @ 2026-05-11  4:03 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: zhenglifeng (A), rafael, stratosk, linux-pm, linux-kernel,
	linuxarm, zhanjie9, lihuisong, yubowen8, zhangpengjie2, wangzhi12,
	linhongye, zhongqiu.han

On 5/8/2026 6:14 PM, Viresh Kumar wrote:
> On 08-05-26, 14:46, Zhongqiu Han wrote:
>> Nice suggestion! — I think it aligns well with the general direction of
>> addressing this issue and relying more on the cpufreq core.
>>
>> That said, I have a concern if we drop the decrease-path early-exit
>> check while keeping the existing condition:
>>
>>          if (requested_freq > freq_step)
>>
>> unchanged. In that case, when the system is already at policy->min, this
>> appears to introduce a governor-level oscillation, even though the
>> effective hardware frequency remains unchanged:
>>
>> Consider the following scenario
>> (policy->min = 200, freq_step = 100, hardware already at min, low load):
>>
>>    Iteration N:
>>      requested_freq = 200
>>      decrease path: 200 > 100 -> requested_freq = 100
>>      __cpufreq_driver_target(100, LE):
>>          clamp -> 200, target == cur
>>          -> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
>>          -> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
>>      dbs_info->requested_freq = 100 <- below min
>>
>>    Iteration N+1:
>>      requested_freq = 100
>>      out-of-range check: reset to policy->cur = 200
>>      decrease path: 200 > 100 -> requested_freq = 100
>>      dbs_info->requested_freq = 100   <- below min again
>>
>> This results in a repeated oscillation pattern. For drivers with
>> CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
>> the driver may be invoked every sampling period even though the
>> effective frequency does not change. I'm happy to defer to your judgment
>> on whether this is significant enough to warrant further changes.
>>
>> Given this, may i know would it make sense to adjust only the decrease
>> path early-exit condition to use dbs_info->requested_freq (i.e. the last
>> target actually requested from hardware), rather than the idle-adjusted
>> local requested_freq?
>>
>>    -     if (requested_freq == policy->min)
>>    +     if (dbs_info->requested_freq == policy->min)
>>              goto out;
>>
>> With this change, the scenario raised by Lifeng
>> (dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
>> follows:
>>
>>    Iteration 1:
>>      requested_freq = 400
>>      idle decay: requested_freq = 200
>>      if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
>>      200 > 100 -> requested_freq = 100
>>      __cpufreq_driver_target(100, LE):
>>          clamp -> 200, target(200) != cur(400)
>>          -> driver invoked, hardware: 400 -> 200 MHz   [bug fixed]
>>      dbs_info->requested_freq = 100   <- one-time transient value
>>
>>    Iteration 2:
>>      requested_freq = 100
>>      out-of-range check: reset to policy->cur = 200
>>      if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out
>>
>>    -> Stable from iteration 2 onward, with no extra driver calls.
>>
>> There is a one-time transient where dbs_info->requested_freq briefly
>> drops below policy->min, but this is harmless: the existing out-of-range
>> check corrects it in the next iteration. A similar situation was
>> discussed before [1], where the conclusion at the time was that
>> __cpufreq_driver_target() already performs the necessary clamping. If it
>> would be clearer or more robust to add an explicit guard
>> here, this can be revisited as well at your discretion.😊
>>
>> [1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/
>>
>> The increase-path early-exit is intentionally left unchanged, to avoid
>> altering the behavior in the scenario where conservative idle decay
>> would otherwise be ignored when the previous requested frequency was
>> already at policy->max.
> 
> What about this ?
> 
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index df01d33993d8..c7ec11de9a43 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -103,10 +103,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>          if (load > dbs_data->up_threshold) {
>                  dbs_info->down_skip = 0;
> 
> -               /* if we are already at full speed then break out early */
> -               if (requested_freq == policy->max)
> -                       goto out;
> -
>                  requested_freq += freq_step;
>                  if (requested_freq > policy->max)
>                          requested_freq = policy->max;
> @@ -124,15 +120,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> 
>          /* Check for frequency decrease */
>          if (load < cs_tuners->down_threshold) {
> -               /*
> -                * if we cannot reduce the frequency anymore, break out early
> -                */
> -               if (requested_freq == policy->min)
> -                       goto out;
> -
> -               if (requested_freq > freq_step)
> -                       requested_freq -= freq_step;
> -               else
> +               requested_freq -= freq_step;
> +               if (requested_freq < policy->min)
>                          requested_freq = policy->min;
> 
>                  __cpufreq_driver_target(policy, requested_freq,
> 

Just one small concern is that requested_freq may underflow as an
unsigned value; perhaps this could be improved by:

-		if (requested_freq > freq_step)
+		if (requested_freq > policy->min + freq_step)
  			requested_freq -= freq_step;
  		else
  			requested_freq = policy->min;

or use min() func?

Additionally, it seems that dropping the early‑exit checks also appears
to be a nice side effect fix for CPUFREQ_NEED_UPDATE_LIMITS drivers when
updating internal upper and lower frequency boundaries.

As designed in commit 1c534352f47f ("cpufreq: Introduce
CPUFREQ_NEED_UPDATE_LIMITS driver flag"), a cpufreq driver may need to
update its internal frequency limits when policy min or max changes for
drivers setting CPUFREQ_NEED_UPDATE_LIMITS.

However, the early‑exit in cs_dbs_update() can prevent
__cpufreq_driver_target() from ever being called.

For example, when policy->min rises from 200 to 400 kHz while policy
->cur is already at 400 kHz, under low load:

   cpufreq_policy_apply_limits():
     policy->min(400) > policy->cur(400) ?  NO -> driver not called

   cs_limits():
     dbs_info->requested_freq = policy->cur = 400

   [next sampling period, low load]
   cs_dbs_update():
     requested_freq = 400
     if (requested_freq == policy->min)  /* 400 == 400 -> true */
         goto out;  /* __cpufreq_driver_target() never called */

With the early‑exit removed, __cpufreq_driver_target() is called with
target == policy->cur, and CPUFREQ_NEED_UPDATE_LIMITS ensures the driver
is invoked to update its internal performance boundaries.


-- 
Thx and BRs,
Zhongqiu Han

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-05-11  4:03 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-21 12:35 [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target Lifeng Zheng
2026-04-22  8:06 ` Stratos Karafotis
2026-04-22  8:36   ` zhenglifeng (A)
2026-04-23  5:39 ` Zhongqiu Han
2026-04-23  7:12   ` zhenglifeng (A)
2026-05-07  9:33     ` Viresh Kumar
2026-05-08  6:46       ` Zhongqiu Han
2026-05-08 10:14         ` Viresh Kumar
2026-05-09  2:01           ` zhenglifeng (A)
2026-05-11  4:03           ` Zhongqiu Han
2026-05-09  1:54       ` zhenglifeng (A)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox