* [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
@ 2026-04-21 12:35 Lifeng Zheng
2026-04-22 8:06 ` Stratos Karafotis
2026-04-23 5:39 ` Zhongqiu Han
0 siblings, 2 replies; 11+ messages in thread
From: Lifeng Zheng @ 2026-04-21 12:35 UTC (permalink / raw)
To: rafael, viresh.kumar, stratosk
Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
zhangpengjie2, wangzhi12, linhongye, zhenglifeng1
In cs_dbs_update(), the requested frequency is decremented by one freq_step
for each idle period. However, this can cause divergence between
'requested_freq' (target for current update) and 'dbs_info->requested_freq'
(target from previous update).
When the load crosses up_threshold or down_threshold, the decision on
whether to increase or decrease frequency should be based on the *previous*
target (dbs_info->requested_freq), not the current one. Otherwise, the
update step may be skipped entirely if the current target has already hit a
boundary due to prior adjustments.
Ensure that frequency scaling decisions are made using the correct
historical target, fixing cases where frequency fails to decrease despite
sustained idle periods.
Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
---
drivers/cpufreq/cpufreq_conservative.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index df01d33993d8..f3c3b54e4bf8 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
dbs_info->down_skip = 0;
/* if we are already at full speed then break out early */
- if (requested_freq == policy->max)
+ if (dbs_info->requested_freq == policy->max)
goto out;
requested_freq += freq_step;
@@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
/*
* if we cannot reduce the frequency anymore, break out early
*/
- if (requested_freq == policy->min)
+ if (dbs_info->requested_freq == policy->min)
goto out;
if (requested_freq > freq_step)
--
2.33.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-04-21 12:35 [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target Lifeng Zheng
@ 2026-04-22 8:06 ` Stratos Karafotis
2026-04-22 8:36 ` zhenglifeng (A)
2026-04-23 5:39 ` Zhongqiu Han
1 sibling, 1 reply; 11+ messages in thread
From: Stratos Karafotis @ 2026-04-22 8:06 UTC (permalink / raw)
To: Lifeng Zheng, rafael, viresh.kumar
Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
zhangpengjie2, wangzhi12, linhongye
Hi all!
I was struggling to see your point, but I think you are right.
The requested_freq could be equal to policy->min (due to idle periods)
while the dbs_info->requested_freq could be greater than policy->min.
So, in this case it breaks out early without the chance to further
reduce the frequency, correct?
Stratos Karafotis
On 4/21/26 15:35, Lifeng Zheng wrote:
> In cs_dbs_update(), the requested frequency is decremented by one freq_step
> for each idle period. However, this can cause divergence between
> 'requested_freq' (target for current update) and 'dbs_info->requested_freq'
> (target from previous update).
>
> When the load crosses up_threshold or down_threshold, the decision on
> whether to increase or decrease frequency should be based on the *previous*
> target (dbs_info->requested_freq), not the current one. Otherwise, the
> update step may be skipped entirely if the current target has already hit a
> boundary due to prior adjustments.
>
> Ensure that frequency scaling decisions are made using the correct
> historical target, fixing cases where frequency fails to decrease despite
> sustained idle periods.
>
> Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
> Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
> ---
> drivers/cpufreq/cpufreq_conservative.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index df01d33993d8..f3c3b54e4bf8 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> dbs_info->down_skip = 0;
>
> /* if we are already at full speed then break out early */
> - if (requested_freq == policy->max)
> + if (dbs_info->requested_freq == policy->max)
> goto out;
>
> requested_freq += freq_step;
> @@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> /*
> * if we cannot reduce the frequency anymore, break out early
> */
> - if (requested_freq == policy->min)
> + if (dbs_info->requested_freq == policy->min)
> goto out;
>
> if (requested_freq > freq_step)
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-04-22 8:06 ` Stratos Karafotis
@ 2026-04-22 8:36 ` zhenglifeng (A)
0 siblings, 0 replies; 11+ messages in thread
From: zhenglifeng (A) @ 2026-04-22 8:36 UTC (permalink / raw)
To: Stratos Karafotis, rafael, viresh.kumar
Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
zhangpengjie2, wangzhi12, linhongye
On 4/22/2026 4:06 PM, Stratos Karafotis wrote:
> Hi all!
>
> I was struggling to see your point, but I think you are right.
>
> The requested_freq could be equal to policy->min (due to idle periods)
> while the dbs_info->requested_freq could be greater than policy->min.
> So, in this case it breaks out early without the chance to further reduce the frequency, correct?
Yes. This is the case where problems arise.
>
> Stratos Karafotis
>
> On 4/21/26 15:35, Lifeng Zheng wrote:
>> In cs_dbs_update(), the requested frequency is decremented by one freq_step
>> for each idle period. However, this can cause divergence between
>> 'requested_freq' (target for current update) and 'dbs_info->requested_freq'
>> (target from previous update).
>>
>> When the load crosses up_threshold or down_threshold, the decision on
>> whether to increase or decrease frequency should be based on the *previous*
>> target (dbs_info->requested_freq), not the current one. Otherwise, the
>> update step may be skipped entirely if the current target has already hit a
>> boundary due to prior adjustments.
>>
>> Ensure that frequency scaling decisions are made using the correct
>> historical target, fixing cases where frequency fails to decrease despite
>> sustained idle periods.
>>
>> Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
>> Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
>> ---
>> drivers/cpufreq/cpufreq_conservative.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>> index df01d33993d8..f3c3b54e4bf8 100644
>> --- a/drivers/cpufreq/cpufreq_conservative.c
>> +++ b/drivers/cpufreq/cpufreq_conservative.c
>> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>> dbs_info->down_skip = 0;
>> /* if we are already at full speed then break out early */
>> - if (requested_freq == policy->max)
>> + if (dbs_info->requested_freq == policy->max)
>> goto out;
>> requested_freq += freq_step;
>> @@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>> /*
>> * if we cannot reduce the frequency anymore, break out early
>> */
>> - if (requested_freq == policy->min)
>> + if (dbs_info->requested_freq == policy->min)
>> goto out;
>> if (requested_freq > freq_step)
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-04-21 12:35 [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target Lifeng Zheng
2026-04-22 8:06 ` Stratos Karafotis
@ 2026-04-23 5:39 ` Zhongqiu Han
2026-04-23 7:12 ` zhenglifeng (A)
1 sibling, 1 reply; 11+ messages in thread
From: Zhongqiu Han @ 2026-04-23 5:39 UTC (permalink / raw)
To: Lifeng Zheng, rafael, viresh.kumar, stratosk
Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
zhangpengjie2, wangzhi12, linhongye, zhongqiu.han
On 4/21/2026 8:35 PM, Lifeng Zheng wrote:
> In cs_dbs_update(), the requested frequency is decremented by one freq_step
> for each idle period. However, this can cause divergence between
> 'requested_freq' (target for current update) and 'dbs_info->requested_freq'
> (target from previous update).
>
> When the load crosses up_threshold or down_threshold, the decision on
> whether to increase or decrease frequency should be based on the *previous*
> target (dbs_info->requested_freq), not the current one. Otherwise, the
> update step may be skipped entirely if the current target has already hit a
> boundary due to prior adjustments.
>
> Ensure that frequency scaling decisions are made using the correct
> historical target, fixing cases where frequency fails to decrease despite
> sustained idle periods.
>
> Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
> Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Hi Lifeng,
Thanks for the patch.
May I know would this ignore conservative idle decay when the previous
requested frequency was policy->max?
Scenario: Increase path, previous target at max, with idle
compensation; the original code does not have the same behavior as the
current patch.
Initial state:
policy->max = 2000 MHz
policy->min = 200 MHz
dbs_info->requested_freq = 2000 MHz (= policy->max)
hardware frequency = 2000 MHz
idle_periods = 2
load = 90% (> up_threshold=80)
1.Original code
Step 1: requested_freq = dbs_info->requested_freq = 2000
Step 2: [idle_periods block]
freq_steps = 2 * 100 = 200
2000 > (200 + 200) = 400 ? YES
requested_freq = 2000 - 200 = 1800
Step 3: [increase path]
if (requested_freq == policy->max)
-> 1800 == 2000 ? NO -> fall through
Step 4: requested_freq += freq_step
requested_freq = 1800 + 100 = 1900
Step 5: __cpufreq_driver_target(policy, 1900, HE) -> hardware = 1900 MHz
Step 6: dbs_info->requested_freq = 1900
Result: hardware 2000 -> *1900 MHz* (net 1-step decrease)
2.Current Patch
Step 1: requested_freq = 2000
Step 2: [idle_periods block] -> requested_freq = 1800
Step 3: if (dbs_info->requested_freq == policy->max)
-> 2000 == 2000 ? YES -> goto out
Step 4: hardware stays at 2000 MHz, dbs_info->requested_freq stays at 2000
Result: hardware stays at *2000 MHz* (no change)
> ---
> drivers/cpufreq/cpufreq_conservative.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index df01d33993d8..f3c3b54e4bf8 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> dbs_info->down_skip = 0;
>
> /* if we are already at full speed then break out early */
> - if (requested_freq == policy->max)
> + if (dbs_info->requested_freq == policy->max)
> goto out;
>
> requested_freq += freq_step;
> @@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> /*
> * if we cannot reduce the frequency anymore, break out early
> */
> - if (requested_freq == policy->min)
> + if (dbs_info->requested_freq == policy->min)
> goto out;
>
> if (requested_freq > freq_step)
--
Thx and BRs,
Zhongqiu Han
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-04-23 5:39 ` Zhongqiu Han
@ 2026-04-23 7:12 ` zhenglifeng (A)
2026-05-07 9:33 ` Viresh Kumar
0 siblings, 1 reply; 11+ messages in thread
From: zhenglifeng (A) @ 2026-04-23 7:12 UTC (permalink / raw)
To: Zhongqiu Han, rafael, viresh.kumar, stratosk
Cc: linux-pm, linux-kernel, linuxarm, zhanjie9, lihuisong, yubowen8,
zhangpengjie2, wangzhi12, linhongye
On 4/23/2026 1:39 PM, Zhongqiu Han wrote:
> On 4/21/2026 8:35 PM, Lifeng Zheng wrote:
>> In cs_dbs_update(), the requested frequency is decremented by one freq_step
>> for each idle period. However, this can cause divergence between
>> 'requested_freq' (target for current update) and 'dbs_info->requested_freq'
>> (target from previous update).
>>
>> When the load crosses up_threshold or down_threshold, the decision on
>> whether to increase or decrease frequency should be based on the *previous*
>> target (dbs_info->requested_freq), not the current one. Otherwise, the
>> update step may be skipped entirely if the current target has already hit a
>> boundary due to prior adjustments.
>>
>> Ensure that frequency scaling decisions are made using the correct
>> historical target, fixing cases where frequency fails to decrease despite
>> sustained idle periods.
>>
>> Fixes: 00bfe05889e9 ("cpufreq: conservative: Decrease frequency faster for deferred updates")
>> Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
>
> Hi Lifeng,
> Thanks for the patch.
>
> May I know would this ignore conservative idle decay when the previous
> requested frequency was policy->max?
>
>
> Scenario: Increase path, previous target at max, with idle
> compensation; the original code does not have the same behavior as the
> current patch.
>
> Initial state:
> policy->max = 2000 MHz
> policy->min = 200 MHz
> dbs_info->requested_freq = 2000 MHz (= policy->max)
> hardware frequency = 2000 MHz
> idle_periods = 2
> load = 90% (> up_threshold=80)
>
> 1.Original code
> Step 1: requested_freq = dbs_info->requested_freq = 2000
>
> Step 2: [idle_periods block]
> freq_steps = 2 * 100 = 200
> 2000 > (200 + 200) = 400 ? YES
> requested_freq = 2000 - 200 = 1800
> Step 3: [increase path]
> if (requested_freq == policy->max)
> -> 1800 == 2000 ? NO -> fall through
>
> Step 4: requested_freq += freq_step
> requested_freq = 1800 + 100 = 1900
>
> Step 5: __cpufreq_driver_target(policy, 1900, HE) -> hardware = 1900 MHz
> Step 6: dbs_info->requested_freq = 1900
>
> Result: hardware 2000 -> *1900 MHz* (net 1-step decrease)
>
>
> 2.Current Patch
> Step 1: requested_freq = 2000
> Step 2: [idle_periods block] -> requested_freq = 1800
> Step 3: if (dbs_info->requested_freq == policy->max)
> -> 2000 == 2000 ? YES -> goto out
> Step 4: hardware stays at 2000 MHz, dbs_info->requested_freq stays at 2000
>
> Result: hardware stays at *2000 MHz* (no change)
>
>
Yes, I think you are right. The behaviors are not the same. I modified this
just in order to keep it consistent with the case exceeding down_threshold.
I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
Viresh could give us some advice.
>
>
>> ---
>> drivers/cpufreq/cpufreq_conservative.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>> index df01d33993d8..f3c3b54e4bf8 100644
>> --- a/drivers/cpufreq/cpufreq_conservative.c
>> +++ b/drivers/cpufreq/cpufreq_conservative.c
>> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>> dbs_info->down_skip = 0;
>> /* if we are already at full speed then break out early */
>> - if (requested_freq == policy->max)
>> + if (dbs_info->requested_freq == policy->max)
>> goto out;
>> requested_freq += freq_step;
>> @@ -127,7 +127,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>> /*
>> * if we cannot reduce the frequency anymore, break out early
>> */
>> - if (requested_freq == policy->min)
>> + if (dbs_info->requested_freq == policy->min)
>> goto out;
>> if (requested_freq > freq_step)
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-04-23 7:12 ` zhenglifeng (A)
@ 2026-05-07 9:33 ` Viresh Kumar
2026-05-08 6:46 ` Zhongqiu Han
2026-05-09 1:54 ` zhenglifeng (A)
0 siblings, 2 replies; 11+ messages in thread
From: Viresh Kumar @ 2026-05-07 9:33 UTC (permalink / raw)
To: zhenglifeng (A)
Cc: Zhongqiu Han, rafael, stratosk, linux-pm, linux-kernel, linuxarm,
zhanjie9, lihuisong, yubowen8, zhangpengjie2, wangzhi12,
linhongye
On 23-04-26, 15:12, zhenglifeng (A) wrote:
> Yes, I think you are right. The behaviors are not the same. I modified this
> just in order to keep it consistent with the case exceeding down_threshold.
> I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
> Viresh could give us some advice.
> >> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> >> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> >> dbs_info->down_skip = 0;
> >> /* if we are already at full speed then break out early */
> >> - if (requested_freq == policy->max)
> >> + if (dbs_info->requested_freq == policy->max)
> >> goto out;
> >> - if (requested_freq == policy->min)
> >> + if (dbs_info->requested_freq == policy->min)
What about dropping these `if` blocks completely ? i.e. always call
__cpufreq_driver_target().
__cpufreq_driver_target() already have similar checks in place to optimize
unnecessary freq changes. We don't really need callers to do the same.
--
viresh
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-05-07 9:33 ` Viresh Kumar
@ 2026-05-08 6:46 ` Zhongqiu Han
2026-05-08 10:14 ` Viresh Kumar
2026-05-09 1:54 ` zhenglifeng (A)
1 sibling, 1 reply; 11+ messages in thread
From: Zhongqiu Han @ 2026-05-08 6:46 UTC (permalink / raw)
To: Viresh Kumar, zhenglifeng (A)
Cc: rafael, stratosk, linux-pm, linux-kernel, linuxarm, zhanjie9,
lihuisong, yubowen8, zhangpengjie2, wangzhi12, linhongye,
zhongqiu.han
On 5/7/2026 5:33 PM, Viresh Kumar wrote:
> On 23-04-26, 15:12, zhenglifeng (A) wrote:
>> Yes, I think you are right. The behaviors are not the same. I modified this
>> just in order to keep it consistent with the case exceeding down_threshold.
>> I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
>> Viresh could give us some advice.
>
>>>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>>>> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>>>> dbs_info->down_skip = 0;
>>>> /* if we are already at full speed then break out early */
>>>> - if (requested_freq == policy->max)
>>>> + if (dbs_info->requested_freq == policy->max)
>>>> goto out;
>
>>>> - if (requested_freq == policy->min)
>>>> + if (dbs_info->requested_freq == policy->min)
>
> What about dropping these `if` blocks completely ? i.e. always call
> __cpufreq_driver_target().
>
> __cpufreq_driver_target() already have similar checks in place to optimize
> unnecessary freq changes. We don't really need callers to do the same.
>
Thanks Viresh a lot,
Nice suggestion! — I think it aligns well with the general direction of
addressing this issue and relying more on the cpufreq core.
That said, I have a concern if we drop the decrease-path early-exit
check while keeping the existing condition:
if (requested_freq > freq_step)
unchanged. In that case, when the system is already at policy->min, this
appears to introduce a governor-level oscillation, even though the
effective hardware frequency remains unchanged:
Consider the following scenario
(policy->min = 200, freq_step = 100, hardware already at min, low load):
Iteration N:
requested_freq = 200
decrease path: 200 > 100 -> requested_freq = 100
__cpufreq_driver_target(100, LE):
clamp -> 200, target == cur
-> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
-> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
dbs_info->requested_freq = 100 <- below min
Iteration N+1:
requested_freq = 100
out-of-range check: reset to policy->cur = 200
decrease path: 200 > 100 -> requested_freq = 100
dbs_info->requested_freq = 100 <- below min again
This results in a repeated oscillation pattern. For drivers with
CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
the driver may be invoked every sampling period even though the
effective frequency does not change. I'm happy to defer to your judgment
on whether this is significant enough to warrant further changes.
Given this, may i know would it make sense to adjust only the decrease
path early-exit condition to use dbs_info->requested_freq (i.e. the last
target actually requested from hardware), rather than the idle-adjusted
local requested_freq?
- if (requested_freq == policy->min)
+ if (dbs_info->requested_freq == policy->min)
goto out;
With this change, the scenario raised by Lifeng
(dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
follows:
Iteration 1:
requested_freq = 400
idle decay: requested_freq = 200
if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
200 > 100 -> requested_freq = 100
__cpufreq_driver_target(100, LE):
clamp -> 200, target(200) != cur(400)
-> driver invoked, hardware: 400 -> 200 MHz [bug fixed]
dbs_info->requested_freq = 100 <- one-time transient value
Iteration 2:
requested_freq = 100
out-of-range check: reset to policy->cur = 200
if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out
-> Stable from iteration 2 onward, with no extra driver calls.
There is a one-time transient where dbs_info->requested_freq briefly
drops below policy->min, but this is harmless: the existing out-of-range
check corrects it in the next iteration. A similar situation was
discussed before [1], where the conclusion at the time was that
__cpufreq_driver_target() already performs the necessary clamping. If it
would be clearer or more robust to add an explicit guard
here, this can be revisited as well at your discretion.😊
[1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/
The increase-path early-exit is intentionally left unchanged, to avoid
altering the behavior in the scenario where conservative idle decay
would otherwise be ignored when the previous requested frequency was
already at policy->max.
--
Thx and BRs,
Zhongqiu Han
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-05-08 6:46 ` Zhongqiu Han
@ 2026-05-08 10:14 ` Viresh Kumar
2026-05-09 2:01 ` zhenglifeng (A)
2026-05-11 4:03 ` Zhongqiu Han
0 siblings, 2 replies; 11+ messages in thread
From: Viresh Kumar @ 2026-05-08 10:14 UTC (permalink / raw)
To: Zhongqiu Han
Cc: zhenglifeng (A), rafael, stratosk, linux-pm, linux-kernel,
linuxarm, zhanjie9, lihuisong, yubowen8, zhangpengjie2, wangzhi12,
linhongye
On 08-05-26, 14:46, Zhongqiu Han wrote:
> Nice suggestion! — I think it aligns well with the general direction of
> addressing this issue and relying more on the cpufreq core.
>
> That said, I have a concern if we drop the decrease-path early-exit
> check while keeping the existing condition:
>
> if (requested_freq > freq_step)
>
> unchanged. In that case, when the system is already at policy->min, this
> appears to introduce a governor-level oscillation, even though the
> effective hardware frequency remains unchanged:
>
> Consider the following scenario
> (policy->min = 200, freq_step = 100, hardware already at min, low load):
>
> Iteration N:
> requested_freq = 200
> decrease path: 200 > 100 -> requested_freq = 100
> __cpufreq_driver_target(100, LE):
> clamp -> 200, target == cur
> -> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
> -> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
> dbs_info->requested_freq = 100 <- below min
>
> Iteration N+1:
> requested_freq = 100
> out-of-range check: reset to policy->cur = 200
> decrease path: 200 > 100 -> requested_freq = 100
> dbs_info->requested_freq = 100 <- below min again
>
> This results in a repeated oscillation pattern. For drivers with
> CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
> the driver may be invoked every sampling period even though the
> effective frequency does not change. I'm happy to defer to your judgment
> on whether this is significant enough to warrant further changes.
>
> Given this, may i know would it make sense to adjust only the decrease
> path early-exit condition to use dbs_info->requested_freq (i.e. the last
> target actually requested from hardware), rather than the idle-adjusted
> local requested_freq?
>
> - if (requested_freq == policy->min)
> + if (dbs_info->requested_freq == policy->min)
> goto out;
>
> With this change, the scenario raised by Lifeng
> (dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
> follows:
>
> Iteration 1:
> requested_freq = 400
> idle decay: requested_freq = 200
> if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
> 200 > 100 -> requested_freq = 100
> __cpufreq_driver_target(100, LE):
> clamp -> 200, target(200) != cur(400)
> -> driver invoked, hardware: 400 -> 200 MHz [bug fixed]
> dbs_info->requested_freq = 100 <- one-time transient value
>
> Iteration 2:
> requested_freq = 100
> out-of-range check: reset to policy->cur = 200
> if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out
>
> -> Stable from iteration 2 onward, with no extra driver calls.
>
> There is a one-time transient where dbs_info->requested_freq briefly
> drops below policy->min, but this is harmless: the existing out-of-range
> check corrects it in the next iteration. A similar situation was
> discussed before [1], where the conclusion at the time was that
> __cpufreq_driver_target() already performs the necessary clamping. If it
> would be clearer or more robust to add an explicit guard
> here, this can be revisited as well at your discretion.😊
>
> [1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/
>
> The increase-path early-exit is intentionally left unchanged, to avoid
> altering the behavior in the scenario where conservative idle decay
> would otherwise be ignored when the previous requested frequency was
> already at policy->max.
What about this ?
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index df01d33993d8..c7ec11de9a43 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -103,10 +103,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
if (load > dbs_data->up_threshold) {
dbs_info->down_skip = 0;
- /* if we are already at full speed then break out early */
- if (requested_freq == policy->max)
- goto out;
-
requested_freq += freq_step;
if (requested_freq > policy->max)
requested_freq = policy->max;
@@ -124,15 +120,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
/* Check for frequency decrease */
if (load < cs_tuners->down_threshold) {
- /*
- * if we cannot reduce the frequency anymore, break out early
- */
- if (requested_freq == policy->min)
- goto out;
-
- if (requested_freq > freq_step)
- requested_freq -= freq_step;
- else
+ requested_freq -= freq_step;
+ if (requested_freq < policy->min)
requested_freq = policy->min;
__cpufreq_driver_target(policy, requested_freq,
--
viresh
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-05-07 9:33 ` Viresh Kumar
2026-05-08 6:46 ` Zhongqiu Han
@ 2026-05-09 1:54 ` zhenglifeng (A)
1 sibling, 0 replies; 11+ messages in thread
From: zhenglifeng (A) @ 2026-05-09 1:54 UTC (permalink / raw)
To: Viresh Kumar
Cc: Zhongqiu Han, rafael, stratosk, linux-pm, linux-kernel, linuxarm,
zhanjie9, lihuisong, yubowen8, zhangpengjie2, wangzhi12,
linhongye
On 5/7/2026 5:33 PM, Viresh Kumar wrote:
> On 23-04-26, 15:12, zhenglifeng (A) wrote:
>> Yes, I think you are right. The behaviors are not the same. I modified this
>> just in order to keep it consistent with the case exceeding down_threshold.
>> I'm not sure if this change of behavior is reasonable. Perhaps Rafael or
>> Viresh could give us some advice.
>
>>>> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
>>>> @@ -104,7 +104,7 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>>>> dbs_info->down_skip = 0;
>>>> /* if we are already at full speed then break out early */
>>>> - if (requested_freq == policy->max)
>>>> + if (dbs_info->requested_freq == policy->max)
>>>> goto out;
>
>>>> - if (requested_freq == policy->min)
>>>> + if (dbs_info->requested_freq == policy->min)
>
> What about dropping these `if` blocks completely ? i.e. always call
> __cpufreq_driver_target().
>
> __cpufreq_driver_target() already have similar checks in place to optimize
> unnecessary freq changes. We don't really need callers to do the same.
>
__cpufreq_driver_target() only skips this when !(cpufreq_driver->flags & CPUFREQ_NEED_UPDATE_LIMITS),
this means that for cppc_cpufreq, amd-pstate and intel_pstate the
unnecessary freq change will still be called.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-05-08 10:14 ` Viresh Kumar
@ 2026-05-09 2:01 ` zhenglifeng (A)
2026-05-11 4:03 ` Zhongqiu Han
1 sibling, 0 replies; 11+ messages in thread
From: zhenglifeng (A) @ 2026-05-09 2:01 UTC (permalink / raw)
To: Viresh Kumar, Zhongqiu Han
Cc: rafael, stratosk, linux-pm, linux-kernel, linuxarm, zhanjie9,
lihuisong, yubowen8, zhangpengjie2, wangzhi12, linhongye
On 5/8/2026 6:14 PM, Viresh Kumar wrote:
>
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index df01d33993d8..c7ec11de9a43 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -103,10 +103,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> if (load > dbs_data->up_threshold) {
> dbs_info->down_skip = 0;
>
> - /* if we are already at full speed then break out early */
> - if (requested_freq == policy->max)
> - goto out;
> -
> requested_freq += freq_step;
> if (requested_freq > policy->max)
> requested_freq = policy->max;
> @@ -124,15 +120,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>
> /* Check for frequency decrease */
> if (load < cs_tuners->down_threshold) {
> - /*
> - * if we cannot reduce the frequency anymore, break out early
> - */
> - if (requested_freq == policy->min)
> - goto out;
> -
> - if (requested_freq > freq_step)
> - requested_freq -= freq_step;
> - else
> + requested_freq -= freq_step;
Small probability but it is still possible that freq_step is larger than
requested_freq.
> + if (requested_freq < policy->min)
> requested_freq = policy->min;
>
> __cpufreq_driver_target(policy, requested_freq,
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target
2026-05-08 10:14 ` Viresh Kumar
2026-05-09 2:01 ` zhenglifeng (A)
@ 2026-05-11 4:03 ` Zhongqiu Han
1 sibling, 0 replies; 11+ messages in thread
From: Zhongqiu Han @ 2026-05-11 4:03 UTC (permalink / raw)
To: Viresh Kumar
Cc: zhenglifeng (A), rafael, stratosk, linux-pm, linux-kernel,
linuxarm, zhanjie9, lihuisong, yubowen8, zhangpengjie2, wangzhi12,
linhongye, zhongqiu.han
On 5/8/2026 6:14 PM, Viresh Kumar wrote:
> On 08-05-26, 14:46, Zhongqiu Han wrote:
>> Nice suggestion! — I think it aligns well with the general direction of
>> addressing this issue and relying more on the cpufreq core.
>>
>> That said, I have a concern if we drop the decrease-path early-exit
>> check while keeping the existing condition:
>>
>> if (requested_freq > freq_step)
>>
>> unchanged. In that case, when the system is already at policy->min, this
>> appears to introduce a governor-level oscillation, even though the
>> effective hardware frequency remains unchanged:
>>
>> Consider the following scenario
>> (policy->min = 200, freq_step = 100, hardware already at min, low load):
>>
>> Iteration N:
>> requested_freq = 200
>> decrease path: 200 > 100 -> requested_freq = 100
>> __cpufreq_driver_target(100, LE):
>> clamp -> 200, target == cur
>> -> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
>> -> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
>> dbs_info->requested_freq = 100 <- below min
>>
>> Iteration N+1:
>> requested_freq = 100
>> out-of-range check: reset to policy->cur = 200
>> decrease path: 200 > 100 -> requested_freq = 100
>> dbs_info->requested_freq = 100 <- below min again
>>
>> This results in a repeated oscillation pattern. For drivers with
>> CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
>> the driver may be invoked every sampling period even though the
>> effective frequency does not change. I'm happy to defer to your judgment
>> on whether this is significant enough to warrant further changes.
>>
>> Given this, may i know would it make sense to adjust only the decrease
>> path early-exit condition to use dbs_info->requested_freq (i.e. the last
>> target actually requested from hardware), rather than the idle-adjusted
>> local requested_freq?
>>
>> - if (requested_freq == policy->min)
>> + if (dbs_info->requested_freq == policy->min)
>> goto out;
>>
>> With this change, the scenario raised by Lifeng
>> (dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
>> follows:
>>
>> Iteration 1:
>> requested_freq = 400
>> idle decay: requested_freq = 200
>> if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
>> 200 > 100 -> requested_freq = 100
>> __cpufreq_driver_target(100, LE):
>> clamp -> 200, target(200) != cur(400)
>> -> driver invoked, hardware: 400 -> 200 MHz [bug fixed]
>> dbs_info->requested_freq = 100 <- one-time transient value
>>
>> Iteration 2:
>> requested_freq = 100
>> out-of-range check: reset to policy->cur = 200
>> if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out
>>
>> -> Stable from iteration 2 onward, with no extra driver calls.
>>
>> There is a one-time transient where dbs_info->requested_freq briefly
>> drops below policy->min, but this is harmless: the existing out-of-range
>> check corrects it in the next iteration. A similar situation was
>> discussed before [1], where the conclusion at the time was that
>> __cpufreq_driver_target() already performs the necessary clamping. If it
>> would be clearer or more robust to add an explicit guard
>> here, this can be revisited as well at your discretion.😊
>>
>> [1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/
>>
>> The increase-path early-exit is intentionally left unchanged, to avoid
>> altering the behavior in the scenario where conservative idle decay
>> would otherwise be ignored when the previous requested frequency was
>> already at policy->max.
>
> What about this ?
>
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index df01d33993d8..c7ec11de9a43 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -103,10 +103,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
> if (load > dbs_data->up_threshold) {
> dbs_info->down_skip = 0;
>
> - /* if we are already at full speed then break out early */
> - if (requested_freq == policy->max)
> - goto out;
> -
> requested_freq += freq_step;
> if (requested_freq > policy->max)
> requested_freq = policy->max;
> @@ -124,15 +120,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
>
> /* Check for frequency decrease */
> if (load < cs_tuners->down_threshold) {
> - /*
> - * if we cannot reduce the frequency anymore, break out early
> - */
> - if (requested_freq == policy->min)
> - goto out;
> -
> - if (requested_freq > freq_step)
> - requested_freq -= freq_step;
> - else
> + requested_freq -= freq_step;
> + if (requested_freq < policy->min)
> requested_freq = policy->min;
>
> __cpufreq_driver_target(policy, requested_freq,
>
Just one small concern is that requested_freq may underflow as an
unsigned value; perhaps this could be improved by:
- if (requested_freq > freq_step)
+ if (requested_freq > policy->min + freq_step)
requested_freq -= freq_step;
else
requested_freq = policy->min;
or use min() func?
Additionally, it seems that dropping the early‑exit checks also appears
to be a nice side effect fix for CPUFREQ_NEED_UPDATE_LIMITS drivers when
updating internal upper and lower frequency boundaries.
As designed in commit 1c534352f47f ("cpufreq: Introduce
CPUFREQ_NEED_UPDATE_LIMITS driver flag"), a cpufreq driver may need to
update its internal frequency limits when policy min or max changes for
drivers setting CPUFREQ_NEED_UPDATE_LIMITS.
However, the early‑exit in cs_dbs_update() can prevent
__cpufreq_driver_target() from ever being called.
For example, when policy->min rises from 200 to 400 kHz while policy
->cur is already at 400 kHz, under low load:
cpufreq_policy_apply_limits():
policy->min(400) > policy->cur(400) ? NO -> driver not called
cs_limits():
dbs_info->requested_freq = policy->cur = 400
[next sampling period, low load]
cs_dbs_update():
requested_freq = 400
if (requested_freq == policy->min) /* 400 == 400 -> true */
goto out; /* __cpufreq_driver_target() never called */
With the early‑exit removed, __cpufreq_driver_target() is called with
target == policy->cur, and CPUFREQ_NEED_UPDATE_LIMITS ensures the driver
is invoked to update its internal performance boundaries.
--
Thx and BRs,
Zhongqiu Han
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2026-05-11 4:03 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-21 12:35 [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target Lifeng Zheng
2026-04-22 8:06 ` Stratos Karafotis
2026-04-22 8:36 ` zhenglifeng (A)
2026-04-23 5:39 ` Zhongqiu Han
2026-04-23 7:12 ` zhenglifeng (A)
2026-05-07 9:33 ` Viresh Kumar
2026-05-08 6:46 ` Zhongqiu Han
2026-05-08 10:14 ` Viresh Kumar
2026-05-09 2:01 ` zhenglifeng (A)
2026-05-11 4:03 ` Zhongqiu Han
2026-05-09 1:54 ` zhenglifeng (A)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox