* Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
2026-04-23 8:47 ` [PATCH 1/1] " Pierre Gondois
@ 2026-04-27 3:08 ` Jie Zhan
2026-04-30 13:41 ` Pierre Gondois
2026-04-28 16:37 ` Sumit Gupta
2026-04-29 13:00 ` Zhongqiu Han
2 siblings, 1 reply; 8+ messages in thread
From: Jie Zhan @ 2026-04-27 3:08 UTC (permalink / raw)
To: Pierre Gondois, linux-kernel
Cc: Lifeng Zheng, Ionela Voinescu, Sumit Gupta, Huang Rui,
Mario Limonciello, Perry Yuan, K Prateek Nayak, Rafael J. Wysocki,
Viresh Kumar, Srinivas Pandruvada, Len Brown, Saravana Kannan,
linux-pm
Hi Pierre,
Thanks for updating this.
A few questions inline.
Regards,
Jie
On 4/23/2026 4:47 PM, Pierre Gondois wrote:
> cpufreq_set_policy() will ultimately override the policy min/max
> values written in the .init() callback through:
> cpufreq_policy_online()
> \-cpufreq_init_policy()
> \-cpufreq_set_policy()
> \-/* Set policy->min/max */
> Thus the policy min/max values provided are only temporary.
>
> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
Just a check, you mean this is the only place that policy->min/max set by
driver->init() may ever be actually used, right?
> cpufreq_policy_online()
> \-cpufreq_init_policy()
> \-__cpufreq_driver_target()
> \-cpufreq_driver->target()
cpufreq_init_policy() doesn't seem to be involved here?
It's supposed to be:
cpufreq_policy_online()
\-__cpufreq_driver_target() /* CPUFREQ_NEED_INITIAL_FREQ_CHECK branch */
\-cpufreq_driver->target()
> is called. To avoid any regression, set policy->min/max in cpufreq.c
> if the values were not initialized.
>
> In this patch:
> - Setting policy->min or max value in driver .init() cb is
> interpreted as setting a QoS constraint.
> - Remove policy->min/max initialization in drivers if the values
> are similar to policy->cpuinfo.min_freq/max_freq.
Why is this necessary?
Doing this will touch many drivers.
Is this mainly for cleaning up? or is there any bugs if we directly take
the existing policy->min/max initialized by drivers (mostly equal to
cpufreq_min/max_freq) as QoS constraints?
> The only drivers where these values are different are:
> - gx-suspmod.c
> - cppc-cpufreq.c
> - longrun.c
> - For the cppc-cpufreq driver, the lowest non-linear freq. is
> used as a min QoS constraint as suggested at:
> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>
> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
> ---
> drivers/cpufreq/amd-pstate.c | 16 ++++++++--------
> drivers/cpufreq/cppc_cpufreq.c | 11 +++++++----
> drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
> drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++--
> drivers/cpufreq/freq_table.c | 7 +++----
> drivers/cpufreq/gx-suspmod.c | 9 +++++----
> drivers/cpufreq/intel_pstate.c | 3 ---
> drivers/cpufreq/pcc-cpufreq.c | 8 ++++----
> drivers/cpufreq/pxa3xx-cpufreq.c | 4 ++--
> drivers/cpufreq/sh-cpufreq.c | 4 ++--
> drivers/cpufreq/virtual-cpufreq.c | 5 +----
> 11 files changed, 51 insertions(+), 39 deletions(-)
>
[...]
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
> cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>
> if (new_policy) {
> + unsigned int min, max;
> +
> + /* Use policy->min/max set by the driver as QoS requests. */
> + min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
> + if (policy->max)
> + max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
> + else
> + max = FREQ_QOS_MAX_DEFAULT_VALUE;
Is it practical (free of bugs) to set the default policy->min/max to
FREQ_QOS_MIN/MAX_DEFAULT_VALUE in cpufreq_policy_alloc(), and keep drivers
initializing policy->min/max?
Such that we may only need to change the following two lines of
FREQ_QOS_MIN/MAX_DEFAULT_VALUE to policy->min/max in this function, without
adding the other two trunks.
> for_each_cpu(j, policy->related_cpus) {
> per_cpu(cpufreq_cpu_data, j) = policy;
> add_cpu_dev_symlink(policy, j, get_cpu_device(j));
> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>
> ret = freq_qos_add_request(&policy->constraints,
> &policy->min_freq_req, FREQ_QOS_MIN,
> - FREQ_QOS_MIN_DEFAULT_VALUE);
> + min);
> if (ret < 0)
> goto out_destroy_policy;
>
> ret = freq_qos_add_request(&policy->constraints,
> &policy->max_freq_req, FREQ_QOS_MAX,
> - FREQ_QOS_MAX_DEFAULT_VALUE);
> + max);
> if (ret < 0)
> goto out_destroy_policy;
>
> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
> CPUFREQ_CREATE_POLICY, policy);
> +
> + /*
> + * If the driver didn't set QoS constraints, policy->min/max still
> + * need to be set as they are used to clamp frequency requests.
> + */
> + policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
> + policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;
> }
>
> if (cpufreq_driver->get && has_target()) {
[...]
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
2026-04-27 3:08 ` Jie Zhan
@ 2026-04-30 13:41 ` Pierre Gondois
0 siblings, 0 replies; 8+ messages in thread
From: Pierre Gondois @ 2026-04-30 13:41 UTC (permalink / raw)
To: Jie Zhan, linux-kernel
Cc: Lifeng Zheng, Ionela Voinescu, Sumit Gupta, Huang Rui,
Mario Limonciello, Perry Yuan, K Prateek Nayak, Rafael J. Wysocki,
Viresh Kumar, Srinivas Pandruvada, Len Brown, Saravana Kannan,
linux-pm
Hello Jie,
On 4/27/26 05:08, Jie Zhan wrote:
> Hi Pierre,
>
> Thanks for updating this.
> A few questions inline.
>
> Regards,
> Jie
>
> On 4/23/2026 4:47 PM, Pierre Gondois wrote:
>> cpufreq_set_policy() will ultimately override the policy min/max
>> values written in the .init() callback through:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-cpufreq_set_policy()
>> \-/* Set policy->min/max */
>> Thus the policy min/max values provided are only temporary.
>>
>> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
> Just a check, you mean this is the only place that policy->min/max set by
> driver->init() may ever be actually used, right?
I think I meant that this is the only place which might use
policy->min or max between:
- cpufreq_driver->init (); ## which set policy->min/max
- cpufreq_set_policy(); ## which reset policy->min/max
so the min/max values set by the driver .init() callback
should only be used in the above call stack.
------
This patch sets policy->min/max values with:
/*
If the driver didn't set QoS constraints, policy->min/max still
need to be set as they are used to clamp frequency requests.
*/
So even if there are other users in the meantime, they
should have correct values.
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-__cpufreq_driver_target()
>> \-cpufreq_driver->target()
> cpufreq_init_policy() doesn't seem to be involved here?
> It's supposed to be:
> cpufreq_policy_online()
> \-__cpufreq_driver_target() /* CPUFREQ_NEED_INITIAL_FREQ_CHECK branch */
> \-cpufreq_driver->target()
Yes right indeed, cpufreq_init_policy() should be removed.
>> is called. To avoid any regression, set policy->min/max in cpufreq.c
>> if the values were not initialized.
>>
>> In this patch:
>> - Setting policy->min or max value in driver .init() cb is
>> interpreted as setting a QoS constraint.
>> - Remove policy->min/max initialization in drivers if the values
>> are similar to policy->cpuinfo.min_freq/max_freq.
> Why is this necessary?
> Doing this will touch many drivers.
> Is this mainly for cleaning up? or is there any bugs if we directly take
> the existing policy->min/max initialized by drivers (mostly equal to
> cpufreq_min/max_freq) as QoS constraints?
I think I should also mention the following commit in the
message:
521223d8b3ec ("cpufreq: Fix initialization of min and max frequency QoS
requests")
Prior to that commit, drivers were setting policy->min/max and
the value was used as a QoS constraint. After that, the value
was only temporarily used. Thus this commit helps coming
back to the first behaviour.
Pengjie Zhang recently wanted to be able to set a default min frequency
for the cppc driver at [1]. So this also helps for his case.
[1]
https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>> The only drivers where these values are different are:
>> - gx-suspmod.c
>> - cppc-cpufreq.c
>> - longrun.c
>> - For the cppc-cpufreq driver, the lowest non-linear freq. is
>> used as a min QoS constraint as suggested at:
>> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>>
>> Signed-off-by: Pierre Gondois<pierre.gondois@arm.com>
>> ---
>> drivers/cpufreq/amd-pstate.c | 16 ++++++++--------
>> drivers/cpufreq/cppc_cpufreq.c | 11 +++++++----
>> drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
>> drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++--
>> drivers/cpufreq/freq_table.c | 7 +++----
>> drivers/cpufreq/gx-suspmod.c | 9 +++++----
>> drivers/cpufreq/intel_pstate.c | 3 ---
>> drivers/cpufreq/pcc-cpufreq.c | 8 ++++----
>> drivers/cpufreq/pxa3xx-cpufreq.c | 4 ++--
>> drivers/cpufreq/sh-cpufreq.c | 4 ++--
>> drivers/cpufreq/virtual-cpufreq.c | 5 +----
>> 11 files changed, 51 insertions(+), 39 deletions(-)
>>
> [...]
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>> cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>>
>> if (new_policy) {
>> + unsigned int min, max;
>> +
>> + /* Use policy->min/max set by the driver as QoS requests. */
>> + min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
>> + if (policy->max)
>> + max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
>> + else
>> + max = FREQ_QOS_MAX_DEFAULT_VALUE;
> Is it practical (free of bugs) to set the default policy->min/max to
> FREQ_QOS_MIN/MAX_DEFAULT_VALUE in cpufreq_policy_alloc(), and keep drivers
> initializing policy->min/max?
>
> Such that we may only need to change the following two lines of
> FREQ_QOS_MIN/MAX_DEFAULT_VALUE to policy->min/max in this function, without
> adding the other two trunks.
One other goal of the patch is to distinguish drivers which:
1. actually requested a QoS min/max limit
2. have set min/max values by default, but don't actually meant it.
Cf. 521223d8b3ec ("cpufreq: Fix initialization of min and max frequency
QoS requests")
Setting policy->max as a QoS constraint might result in performance
limitations, so we should try to avoid that.
FWIU, removing the default initialization of policy->min/max
must be done to handle case 2., but maybe I missed something.
>> for_each_cpu(j, policy->related_cpus) {
>> per_cpu(cpufreq_cpu_data, j) = policy;
>> add_cpu_dev_symlink(policy, j, get_cpu_device(j));
>> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>>
>> ret = freq_qos_add_request(&policy->constraints,
>> &policy->min_freq_req, FREQ_QOS_MIN,
>> - FREQ_QOS_MIN_DEFAULT_VALUE);
>> + min);
>> if (ret < 0)
>> goto out_destroy_policy;
>>
>> ret = freq_qos_add_request(&policy->constraints,
>> &policy->max_freq_req, FREQ_QOS_MAX,
>> - FREQ_QOS_MAX_DEFAULT_VALUE);
>> + max);
>> if (ret < 0)
>> goto out_destroy_policy;
>>
>> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>> CPUFREQ_CREATE_POLICY, policy);
>> +
>> + /*
>> + * If the driver didn't set QoS constraints, policy->min/max still
>> + * need to be set as they are used to clamp frequency requests.
>> + */
>> + policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
>> + policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;
>> }
>>
>> if (cpufreq_driver->get && has_target()) {
> [...]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
2026-04-23 8:47 ` [PATCH 1/1] " Pierre Gondois
2026-04-27 3:08 ` Jie Zhan
@ 2026-04-28 16:37 ` Sumit Gupta
2026-04-30 13:41 ` Pierre Gondois
2026-04-29 13:00 ` Zhongqiu Han
2 siblings, 1 reply; 8+ messages in thread
From: Sumit Gupta @ 2026-04-28 16:37 UTC (permalink / raw)
To: Pierre Gondois, linux-kernel
Cc: Jie Zhan, Lifeng Zheng, Ionela Voinescu, Huang Rui,
Mario Limonciello, Perry Yuan, K Prateek Nayak, Rafael J. Wysocki,
Viresh Kumar, Srinivas Pandruvada, Len Brown, Saravana Kannan,
linux-pm
On 23/04/26 14:17, Pierre Gondois wrote:
> External email: Use caution opening links or attachments
>
>
> cpufreq_set_policy() will ultimately override the policy min/max
> values written in the .init() callback through:
> cpufreq_policy_online()
> \-cpufreq_init_policy()
> \-cpufreq_set_policy()
> \-/* Set policy->min/max */
> Thus the policy min/max values provided are only temporary.
>
> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
> cpufreq_policy_online()
> \-cpufreq_init_policy()
> \-__cpufreq_driver_target()
> \-cpufreq_driver->target()
> is called. To avoid any regression, set policy->min/max in cpufreq.c
> if the values were not initialized.
>
> In this patch:
> - Setting policy->min or max value in driver .init() cb is
> interpreted as setting a QoS constraint.
> - Remove policy->min/max initialization in drivers if the values
> are similar to policy->cpuinfo.min_freq/max_freq.
> The only drivers where these values are different are:
> - gx-suspmod.c
> - cppc-cpufreq.c
> - longrun.c
> - For the cppc-cpufreq driver, the lowest non-linear freq. is
> used as a min QoS constraint as suggested at:
> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>
> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Tested-by: Sumit Gupta <sumitg@nvidia.com>
Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
> ---
> drivers/cpufreq/amd-pstate.c | 16 ++++++++--------
> drivers/cpufreq/cppc_cpufreq.c | 11 +++++++----
> drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
> drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++--
> drivers/cpufreq/freq_table.c | 7 +++----
> drivers/cpufreq/gx-suspmod.c | 9 +++++----
> drivers/cpufreq/intel_pstate.c | 3 ---
> drivers/cpufreq/pcc-cpufreq.c | 8 ++++----
> drivers/cpufreq/pxa3xx-cpufreq.c | 4 ++--
> drivers/cpufreq/sh-cpufreq.c | 4 ++--
> drivers/cpufreq/virtual-cpufreq.c | 5 +----
> 11 files changed, 51 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 453084c67327f..1ed4bcdcc957f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>
> perf = READ_ONCE(cpudata->perf);
>
> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
> - cpudata->nominal_freq,
> - perf.lowest_perf);
> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
> + policy->cpuinfo.min_freq = perf_to_freq(perf,
> + cpudata->nominal_freq,
> + perf.lowest_perf);
> + policy->cpuinfo.max_freq = cpudata->max_freq;
>
> policy->driver_data = cpudata;
> ret = amd_pstate_cppc_enable(policy);
> @@ -1907,10 +1907,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>
> perf = READ_ONCE(cpudata->perf);
>
> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
> - cpudata->nominal_freq,
> - perf.lowest_perf);
> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
> + policy->cpuinfo.min_freq = perf_to_freq(perf,
> + cpudata->nominal_freq,
> + perf.lowest_perf);
> + policy->cpuinfo.max_freq = cpudata->max_freq;
> policy->driver_data = cpudata;
>
> ret = amd_pstate_cppc_enable(policy);
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> unsigned int cpu = policy->cpu;
> struct cppc_cpudata *cpu_data;
> struct cppc_perf_caps *caps;
> + unsigned int min, max;
> int ret;
>
> cpu_data = cppc_cpufreq_get_cpu_data(cpu);
> @@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> caps = &cpu_data->perf_caps;
> policy->driver_data = cpu_data;
>
> + min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
> + max = cppc_perf_to_khz(caps, policy->boost_enabled ?
> + caps->highest_perf : caps->nominal_perf);
> +
> /*
> * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
> * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
> */
> - policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
> - policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
> - caps->highest_perf : caps->nominal_perf);
> + policy->min = min;
>
> /*
> * Set cpuinfo.min_freq to Lowest to make the full range of performance
> @@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> * nonlinear perf
> */
> policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
> - policy->cpuinfo.max_freq = policy->max;
> + policy->cpuinfo.max_freq = max;
>
> policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us(cpu);
> policy->shared_type = cpu_data->shared_type;
> diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
> index fbbbe501cf2dc..831102522ad64 100644
> --- a/drivers/cpufreq/cpufreq-nforce2.c
> +++ b/drivers/cpufreq/cpufreq-nforce2.c
> @@ -355,8 +355,8 @@ static int nforce2_cpu_init(struct cpufreq_policy *policy)
> min_fsb = NFORCE2_MIN_FSB;
>
> /* cpuinfo and default policy values */
> - policy->min = policy->cpuinfo.min_freq = min_fsb * fid * 100;
> - policy->max = policy->cpuinfo.max_freq = max_fsb * fid * 100;
> + policy->cpuinfo.min_freq = min_fsb * fid * 100;
> + policy->cpuinfo.max_freq = max_fsb * fid * 100;
>
> return 0;
> }
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
> cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>
> if (new_policy) {
> + unsigned int min, max;
> +
> + /* Use policy->min/max set by the driver as QoS requests. */
> + min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
> + if (policy->max)
> + max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
> + else
> + max = FREQ_QOS_MAX_DEFAULT_VALUE;
> for_each_cpu(j, policy->related_cpus) {
> per_cpu(cpufreq_cpu_data, j) = policy;
> add_cpu_dev_symlink(policy, j, get_cpu_device(j));
> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>
> ret = freq_qos_add_request(&policy->constraints,
> &policy->min_freq_req, FREQ_QOS_MIN,
> - FREQ_QOS_MIN_DEFAULT_VALUE);
> + min);
> if (ret < 0)
> goto out_destroy_policy;
>
> ret = freq_qos_add_request(&policy->constraints,
> &policy->max_freq_req, FREQ_QOS_MAX,
> - FREQ_QOS_MAX_DEFAULT_VALUE);
> + max);
> if (ret < 0)
> goto out_destroy_policy;
>
> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
> CPUFREQ_CREATE_POLICY, policy);
> +
> + /*
> + * If the driver didn't set QoS constraints, policy->min/max still
> + * need to be set as they are used to clamp frequency requests.
> + */
> + policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
> + policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;
> }
>
> if (cpufreq_driver->get && has_target()) {
> diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c
> index 5b364d8da4f92..ea994647abc88 100644
> --- a/drivers/cpufreq/freq_table.c
> +++ b/drivers/cpufreq/freq_table.c
> @@ -49,16 +49,15 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy)
> max_freq = freq;
> }
>
> - policy->min = policy->cpuinfo.min_freq = min_freq;
> - policy->max = max_freq;
> + policy->cpuinfo.min_freq = min_freq;
> /*
> * If the driver has set its own cpuinfo.max_freq above max_freq, leave
> * it as is.
> */
> if (policy->cpuinfo.max_freq < max_freq)
> - policy->max = policy->cpuinfo.max_freq = max_freq;
> + policy->cpuinfo.max_freq = max_freq;
>
> - if (policy->min == ~0)
> + if (min_freq == ~0)
> return -EINVAL;
> else
> return 0;
> diff --git a/drivers/cpufreq/gx-suspmod.c b/drivers/cpufreq/gx-suspmod.c
> index d269a4f26f98e..ebda2bbebf44b 100644
> --- a/drivers/cpufreq/gx-suspmod.c
> +++ b/drivers/cpufreq/gx-suspmod.c
> @@ -397,7 +397,7 @@ static int cpufreq_gx_target(struct cpufreq_policy *policy,
>
> static int cpufreq_gx_cpu_init(struct cpufreq_policy *policy)
> {
> - unsigned int maxfreq;
> + unsigned int minfreq, maxfreq;
>
> if (!policy || policy->cpu != 0)
> return -ENODEV;
> @@ -418,10 +418,11 @@ static int cpufreq_gx_cpu_init(struct cpufreq_policy *policy)
> policy->cpu = 0;
>
> if (max_duration < POLICY_MIN_DIV)
> - policy->min = maxfreq / max_duration;
> + minfreq = maxfreq / max_duration;
> else
> - policy->min = maxfreq / POLICY_MIN_DIV;
> - policy->max = maxfreq;
> + minfreq = maxfreq / POLICY_MIN_DIV;
> +
> + policy->min = minfreq;
> policy->cpuinfo.min_freq = maxfreq / max_duration;
> policy->cpuinfo.max_freq = maxfreq;
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 1292da53e5fcb..68ccc6eb1ef30 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -3049,9 +3049,6 @@ static int __intel_pstate_cpu_init(struct cpufreq_policy *policy)
> policy->cpuinfo.max_freq = READ_ONCE(global.no_turbo) ?
> cpu->pstate.max_freq : cpu->pstate.turbo_freq;
>
> - policy->min = policy->cpuinfo.min_freq;
> - policy->max = policy->cpuinfo.max_freq;
> -
> intel_pstate_init_acpi_perf_limits(policy);
>
> policy->fast_switch_possible = true;
> diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
> index ac2e90a65f0c4..231edfe8cabaa 100644
> --- a/drivers/cpufreq/pcc-cpufreq.c
> +++ b/drivers/cpufreq/pcc-cpufreq.c
> @@ -551,13 +551,13 @@ static int pcc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> goto out;
> }
>
> - policy->max = policy->cpuinfo.max_freq =
> + policy->cpuinfo.max_freq =
> ioread32(&pcch_hdr->nominal) * 1000;
> - policy->min = policy->cpuinfo.min_freq =
> + policy->cpuinfo.min_freq =
> ioread32(&pcch_hdr->minimum_frequency) * 1000;
>
> - pr_debug("init: policy->max is %d, policy->min is %d\n",
> - policy->max, policy->min);
> + pr_debug("init: max_freq is %d, min_freq is %d\n",
> + policy->cpuinfo.max_freq, policy->cpuinfo.min_freq);
> out:
> return result;
> }
> diff --git a/drivers/cpufreq/pxa3xx-cpufreq.c b/drivers/cpufreq/pxa3xx-cpufreq.c
> index 50ff3b6a69000..181962b0924e6 100644
> --- a/drivers/cpufreq/pxa3xx-cpufreq.c
> +++ b/drivers/cpufreq/pxa3xx-cpufreq.c
> @@ -185,8 +185,8 @@ static int pxa3xx_cpufreq_init(struct cpufreq_policy *policy)
> int ret = -EINVAL;
>
> /* set default policy and cpuinfo */
> - policy->min = policy->cpuinfo.min_freq = 104000;
> - policy->max = policy->cpuinfo.max_freq =
> + policy->cpuinfo.min_freq = 104000;
> + policy->cpuinfo.max_freq =
> (cpu_is_pxa320()) ? 806000 : 624000;
> policy->cpuinfo.transition_latency = 1000; /* FIXME: 1 ms, assumed */
>
> diff --git a/drivers/cpufreq/sh-cpufreq.c b/drivers/cpufreq/sh-cpufreq.c
> index 642ddb9ea217e..244153a1cead2 100644
> --- a/drivers/cpufreq/sh-cpufreq.c
> +++ b/drivers/cpufreq/sh-cpufreq.c
> @@ -124,9 +124,9 @@ static int sh_cpufreq_cpu_init(struct cpufreq_policy *policy)
> dev_notice(dev, "no frequency table found, falling back "
> "to rate rounding.\n");
>
> - policy->min = policy->cpuinfo.min_freq =
> + policy->cpuinfo.min_freq =
> (clk_round_rate(cpuclk, 1) + 500) / 1000;
> - policy->max = policy->cpuinfo.max_freq =
> + policy->cpuinfo.max_freq =
> (clk_round_rate(cpuclk, ~0UL) + 500) / 1000;
> }
>
> diff --git a/drivers/cpufreq/virtual-cpufreq.c b/drivers/cpufreq/virtual-cpufreq.c
> index 4159f31349b16..dc78b74409af4 100644
> --- a/drivers/cpufreq/virtual-cpufreq.c
> +++ b/drivers/cpufreq/virtual-cpufreq.c
> @@ -164,10 +164,7 @@ static int virt_cpufreq_get_freq_info(struct cpufreq_policy *policy)
> policy->cpuinfo.min_freq = 1;
> policy->cpuinfo.max_freq = virt_cpufreq_get_perftbl_entry(policy->cpu, 0);
>
> - policy->min = policy->cpuinfo.min_freq;
> - policy->max = policy->cpuinfo.max_freq;
> -
> - policy->cur = policy->max;
> + policy->cur = policy->cpuinfo.max_freq;
> return 0;
> }
>
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
2026-04-28 16:37 ` Sumit Gupta
@ 2026-04-30 13:41 ` Pierre Gondois
0 siblings, 0 replies; 8+ messages in thread
From: Pierre Gondois @ 2026-04-30 13:41 UTC (permalink / raw)
To: Sumit Gupta, linux-kernel
Cc: Jie Zhan, Lifeng Zheng, Ionela Voinescu, Huang Rui,
Mario Limonciello, Perry Yuan, K Prateek Nayak, Rafael J. Wysocki,
Viresh Kumar, Srinivas Pandruvada, Len Brown, Saravana Kannan,
linux-pm
Hello Sumit,
On 4/28/26 18:37, Sumit Gupta wrote:
>
> On 23/04/26 14:17, Pierre Gondois wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> cpufreq_set_policy() will ultimately override the policy min/max
>> values written in the .init() callback through:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-cpufreq_set_policy()
>> \-/* Set policy->min/max */
>> Thus the policy min/max values provided are only temporary.
>>
>> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-__cpufreq_driver_target()
>> \-cpufreq_driver->target()
>> is called. To avoid any regression, set policy->min/max in cpufreq.c
>> if the values were not initialized.
>>
>> In this patch:
>> - Setting policy->min or max value in driver .init() cb is
>> interpreted as setting a QoS constraint.
>> - Remove policy->min/max initialization in drivers if the values
>> are similar to policy->cpuinfo.min_freq/max_freq.
>> The only drivers where these values are different are:
>> - gx-suspmod.c
>> - cppc-cpufreq.c
>> - longrun.c
>> - For the cppc-cpufreq driver, the lowest non-linear freq. is
>> used as a min QoS constraint as suggested at:
>> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>>
>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>
> Tested-by: Sumit Gupta <sumitg@nvidia.com>
> Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
Thanks!
There might be a v2 with some small modifications.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
2026-04-23 8:47 ` [PATCH 1/1] " Pierre Gondois
2026-04-27 3:08 ` Jie Zhan
2026-04-28 16:37 ` Sumit Gupta
@ 2026-04-29 13:00 ` Zhongqiu Han
2026-04-30 13:41 ` Pierre Gondois
2 siblings, 1 reply; 8+ messages in thread
From: Zhongqiu Han @ 2026-04-29 13:00 UTC (permalink / raw)
To: Pierre Gondois, linux-kernel
Cc: Jie Zhan, Lifeng Zheng, Ionela Voinescu, Sumit Gupta, Huang Rui,
Mario Limonciello, Perry Yuan, K Prateek Nayak, Rafael J. Wysocki,
Viresh Kumar, Srinivas Pandruvada, Len Brown, Saravana Kannan,
linux-pm, zhongqiu.han
On 4/23/2026 4:47 PM, Pierre Gondois wrote:
> cpufreq_set_policy() will ultimately override the policy min/max
> values written in the .init() callback through:
> cpufreq_policy_online()
> \-cpufreq_init_policy()
> \-cpufreq_set_policy()
> \-/* Set policy->min/max */
> Thus the policy min/max values provided are only temporary.
>
> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
> cpufreq_policy_online()
> \-cpufreq_init_policy()
> \-__cpufreq_driver_target()
> \-cpufreq_driver->target()
> is called. To avoid any regression, set policy->min/max in cpufreq.c
> if the values were not initialized.
>
> In this patch:
> - Setting policy->min or max value in driver .init() cb is
> interpreted as setting a QoS constraint.
> - Remove policy->min/max initialization in drivers if the values
> are similar to policy->cpuinfo.min_freq/max_freq.
> The only drivers where these values are different are:
> - gx-suspmod.c
> - cppc-cpufreq.c
> - longrun.c
> - For the cppc-cpufreq driver, the lowest non-linear freq. is
> used as a min QoS constraint as suggested at:
> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>
> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Hi Pierre,
Thanks for the patch. I have a few additional inline comments/questions
below.
> ---
> drivers/cpufreq/amd-pstate.c | 16 ++++++++--------
> drivers/cpufreq/cppc_cpufreq.c | 11 +++++++----
> drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
> drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++--
> drivers/cpufreq/freq_table.c | 7 +++----
> drivers/cpufreq/gx-suspmod.c | 9 +++++----
> drivers/cpufreq/intel_pstate.c | 3 ---
> drivers/cpufreq/pcc-cpufreq.c | 8 ++++----
> drivers/cpufreq/pxa3xx-cpufreq.c | 4 ++--
> drivers/cpufreq/sh-cpufreq.c | 4 ++--
> drivers/cpufreq/virtual-cpufreq.c | 5 +----
> 11 files changed, 51 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 453084c67327f..1ed4bcdcc957f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>
> perf = READ_ONCE(cpudata->perf);
>
> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
> - cpudata->nominal_freq,
> - perf.lowest_perf);
> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
> + policy->cpuinfo.min_freq = perf_to_freq(perf,
> + cpudata->nominal_freq,
> + perf.lowest_perf);
> + policy->cpuinfo.max_freq = cpudata->max_freq;
It is better to update doc as well to avoid new dirver developmenter set
policy->min / policy->max again?
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/cpu-drivers.rst#n102
>
> policy->driver_data = cpudata;
> ret = amd_pstate_cppc_enable(policy);
> @@ -1907,10 +1907,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>
> perf = READ_ONCE(cpudata->perf);
>
> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
> - cpudata->nominal_freq,
> - perf.lowest_perf);
> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
> + policy->cpuinfo.min_freq = perf_to_freq(perf,
> + cpudata->nominal_freq,
> + perf.lowest_perf);
> + policy->cpuinfo.max_freq = cpudata->max_freq;
> policy->driver_data = cpudata;
>
> ret = amd_pstate_cppc_enable(policy);
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> unsigned int cpu = policy->cpu;
> struct cppc_cpudata *cpu_data;
> struct cppc_perf_caps *caps;
> + unsigned int min, max;
> int ret;
>
> cpu_data = cppc_cpufreq_get_cpu_data(cpu);
> @@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> caps = &cpu_data->perf_caps;
> policy->driver_data = cpu_data;
>
> + min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
> + max = cppc_perf_to_khz(caps, policy->boost_enabled ?
> + caps->highest_perf : caps->nominal_perf);
> +
> /*
> * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
> * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
> */
> - policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
> - policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
> - caps->highest_perf : caps->nominal_perf);
> + policy->min = min;
>
> /*
> * Set cpuinfo.min_freq to Lowest to make the full range of performance
> @@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> * nonlinear perf
> */
> policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
> - policy->cpuinfo.max_freq = policy->max;
> + policy->cpuinfo.max_freq = max;
>
> policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us(cpu);
> policy->shared_type = cpu_data->shared_type;
> diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
> index fbbbe501cf2dc..831102522ad64 100644
> --- a/drivers/cpufreq/cpufreq-nforce2.c
> +++ b/drivers/cpufreq/cpufreq-nforce2.c
> @@ -355,8 +355,8 @@ static int nforce2_cpu_init(struct cpufreq_policy *policy)
> min_fsb = NFORCE2_MIN_FSB;
>
> /* cpuinfo and default policy values */
> - policy->min = policy->cpuinfo.min_freq = min_fsb * fid * 100;
> - policy->max = policy->cpuinfo.max_freq = max_fsb * fid * 100;
> + policy->cpuinfo.min_freq = min_fsb * fid * 100;
> + policy->cpuinfo.max_freq = max_fsb * fid * 100;
>
> return 0;
> }
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
> cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>
> if (new_policy) {
> + unsigned int min, max;
> +
> + /* Use policy->min/max set by the driver as QoS requests. */
> + min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
> + if (policy->max)
> + max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
> + else
> + max = FREQ_QOS_MAX_DEFAULT_VALUE;
Nit: Using local variables named min/max is confusing here since they
shadow the common min()/max() macros; renaming them (e.g. min_freq
/ max_freq) would improve readability and maintainability.
> for_each_cpu(j, policy->related_cpus) {
> per_cpu(cpufreq_cpu_data, j) = policy;
> add_cpu_dev_symlink(policy, j, get_cpu_device(j));
> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>
> ret = freq_qos_add_request(&policy->constraints,
> &policy->min_freq_req, FREQ_QOS_MIN,
> - FREQ_QOS_MIN_DEFAULT_VALUE);
> + min);
It seems that the current patch is not merely a superficial cleanup; it
also changes the policy->min value in the GX driver, setting it to the
5% value expected by the driver. If so, we should document it in the
commit message.
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/
tree/drivers/cpufreq/gx-suspmod.c#n137
/* gx-suspmod.c constants:
* POLICY_MIN_DIV = 20
* max_duration = 255
* maxfreq = e.g. 300000 kHz (300 MHz)
*/
cpufreq_policy_online()
cpufreq_driver->init(policy) /* cpufreq_gx_cpu_init() */
policy->min = maxfreq/20 /* 15000 kHz, 5% */
cpuinfo.min_freq = maxfreq/255 /* 1176 kHz, 0.39% */
/* Before current patch: 0, not policy->min */
freq_qos_add_request(..., FREQ_QOS_MIN, 0)
cpufreq_init_policy()
cpufreq_set_policy()
/* reads QoS=0, discards init()'s 15000 */
new_data.min = freq_qos_read_value(FREQ_QOS_MIN)
cpufreq_gx_verify()
cpufreq_verify_within_cpu_limits()
/* 0 < 1176: clamp to hw floor */
new_data.min = cpuinfo.min_freq /* 1176 kHz */
WRITE_ONCE(policy->min, 1176) /* 0.39%, not 5% */
After current patch:
freq_qos_add_request(..., FREQ_QOS_MIN, policy->min)
=> new_data.min stays 15000, no clamping, policy->min = 15000
> if (ret < 0)
> goto out_destroy_policy;
>
> ret = freq_qos_add_request(&policy->constraints,
> &policy->max_freq_req, FREQ_QOS_MAX,
> - FREQ_QOS_MAX_DEFAULT_VALUE);
> + max);
> if (ret < 0)
> goto out_destroy_policy;
>
> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
> CPUFREQ_CREATE_POLICY, policy);
> +
> + /*
> + * If the driver didn't set QoS constraints, policy->min/max still
> + * need to be set as they are used to clamp frequency requests.
> + */
> + policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
> + policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;
Does it make sense to set policy->min / policy->max before the
CPUFREQ_CREATE_POLICY notifier, since some drivers may use them in their
callbacks?
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/core.rst#n58
> }
>
> if (cpufreq_driver->get && has_target()) {
> diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c
> index 5b364d8da4f92..ea994647abc88 100644
> --- a/drivers/cpufreq/freq_table.c
> +++ b/drivers/cpufreq/freq_table.c
> @@ -49,16 +49,15 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy)
> max_freq = freq;
> }
>
> - policy->min = policy->cpuinfo.min_freq = min_freq;
> - policy->max = max_freq;
> + policy->cpuinfo.min_freq = min_freq;
> /*
> * If the driver has set its own cpuinfo.max_freq above max_freq, leave
> * it as is.
> */
> if (policy->cpuinfo.max_freq < max_freq)
> - policy->max = policy->cpuinfo.max_freq = max_freq;
> + policy->cpuinfo.max_freq = max_freq;
>
> - if (policy->min == ~0)
> + if (min_freq == ~0)
> return -EINVAL;
> else
> return 0;
> diff --git a/drivers/cpufreq/gx-suspmod.c b/drivers/cpufreq/gx-suspmod.c
> index d269a4f26f98e..ebda2bbebf44b 100644
> --- a/drivers/cpufreq/gx-suspmod.c
> +++ b/drivers/cpufreq/gx-suspmod.c
> @@ -397,7 +397,7 @@ static int cpufreq_gx_target(struct cpufreq_policy *policy,
>
> static int cpufreq_gx_cpu_init(struct cpufreq_policy *policy)
> {
> - unsigned int maxfreq;
> + unsigned int minfreq, maxfreq;
>
> if (!policy || policy->cpu != 0)
> return -ENODEV;
> @@ -418,10 +418,11 @@ static int cpufreq_gx_cpu_init(struct cpufreq_policy *policy)
> policy->cpu = 0;
>
> if (max_duration < POLICY_MIN_DIV)
> - policy->min = maxfreq / max_duration;
> + minfreq = maxfreq / max_duration;
> else
> - policy->min = maxfreq / POLICY_MIN_DIV;
> - policy->max = maxfreq;
> + minfreq = maxfreq / POLICY_MIN_DIV;
> +
> + policy->min = minfreq;
> policy->cpuinfo.min_freq = maxfreq / max_duration;
> policy->cpuinfo.max_freq = maxfreq;
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 1292da53e5fcb..68ccc6eb1ef30 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -3049,9 +3049,6 @@ static int __intel_pstate_cpu_init(struct cpufreq_policy *policy)
> policy->cpuinfo.max_freq = READ_ONCE(global.no_turbo) ?
> cpu->pstate.max_freq : cpu->pstate.turbo_freq;
>
> - policy->min = policy->cpuinfo.min_freq;
> - policy->max = policy->cpuinfo.max_freq;
> -
> intel_pstate_init_acpi_perf_limits(policy);
>
> policy->fast_switch_possible = true;
> diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
> index ac2e90a65f0c4..231edfe8cabaa 100644
> --- a/drivers/cpufreq/pcc-cpufreq.c
> +++ b/drivers/cpufreq/pcc-cpufreq.c
> @@ -551,13 +551,13 @@ static int pcc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> goto out;
> }
>
> - policy->max = policy->cpuinfo.max_freq =
> + policy->cpuinfo.max_freq =
> ioread32(&pcch_hdr->nominal) * 1000;
> - policy->min = policy->cpuinfo.min_freq =
> + policy->cpuinfo.min_freq =
> ioread32(&pcch_hdr->minimum_frequency) * 1000;
>
> - pr_debug("init: policy->max is %d, policy->min is %d\n",
> - policy->max, policy->min);
> + pr_debug("init: max_freq is %d, min_freq is %d\n",
> + policy->cpuinfo.max_freq, policy->cpuinfo.min_freq);
> out:
> return result;
> }
> diff --git a/drivers/cpufreq/pxa3xx-cpufreq.c b/drivers/cpufreq/pxa3xx-cpufreq.c
> index 50ff3b6a69000..181962b0924e6 100644
> --- a/drivers/cpufreq/pxa3xx-cpufreq.c
> +++ b/drivers/cpufreq/pxa3xx-cpufreq.c
> @@ -185,8 +185,8 @@ static int pxa3xx_cpufreq_init(struct cpufreq_policy *policy)
> int ret = -EINVAL;
>
> /* set default policy and cpuinfo */
> - policy->min = policy->cpuinfo.min_freq = 104000;
> - policy->max = policy->cpuinfo.max_freq =
> + policy->cpuinfo.min_freq = 104000;
> + policy->cpuinfo.max_freq =
> (cpu_is_pxa320()) ? 806000 : 624000;
> policy->cpuinfo.transition_latency = 1000; /* FIXME: 1 ms, assumed */
>
> diff --git a/drivers/cpufreq/sh-cpufreq.c b/drivers/cpufreq/sh-cpufreq.c
> index 642ddb9ea217e..244153a1cead2 100644
> --- a/drivers/cpufreq/sh-cpufreq.c
> +++ b/drivers/cpufreq/sh-cpufreq.c
> @@ -124,9 +124,9 @@ static int sh_cpufreq_cpu_init(struct cpufreq_policy *policy)
> dev_notice(dev, "no frequency table found, falling back "
> "to rate rounding.\n");
>
> - policy->min = policy->cpuinfo.min_freq =
> + policy->cpuinfo.min_freq =
> (clk_round_rate(cpuclk, 1) + 500) / 1000;
> - policy->max = policy->cpuinfo.max_freq =
> + policy->cpuinfo.max_freq =
> (clk_round_rate(cpuclk, ~0UL) + 500) / 1000;
> }
>
> diff --git a/drivers/cpufreq/virtual-cpufreq.c b/drivers/cpufreq/virtual-cpufreq.c
> index 4159f31349b16..dc78b74409af4 100644
> --- a/drivers/cpufreq/virtual-cpufreq.c
> +++ b/drivers/cpufreq/virtual-cpufreq.c
> @@ -164,10 +164,7 @@ static int virt_cpufreq_get_freq_info(struct cpufreq_policy *policy)
> policy->cpuinfo.min_freq = 1;
> policy->cpuinfo.max_freq = virt_cpufreq_get_perftbl_entry(policy->cpu, 0);
>
> - policy->min = policy->cpuinfo.min_freq;
> - policy->max = policy->cpuinfo.max_freq;
> -
> - policy->cur = policy->max;
> + policy->cur = policy->cpuinfo.max_freq;
> return 0;
> }
>
--
Thx and BRs,
Zhongqiu Han
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
2026-04-29 13:00 ` Zhongqiu Han
@ 2026-04-30 13:41 ` Pierre Gondois
0 siblings, 0 replies; 8+ messages in thread
From: Pierre Gondois @ 2026-04-30 13:41 UTC (permalink / raw)
To: Zhongqiu Han, linux-kernel
Cc: Jie Zhan, Lifeng Zheng, Ionela Voinescu, Sumit Gupta, Huang Rui,
Mario Limonciello, Perry Yuan, K Prateek Nayak, Rafael J. Wysocki,
Viresh Kumar, Srinivas Pandruvada, Len Brown, Saravana Kannan,
linux-pm
Hello Zhongqiu,
On 4/29/26 15:00, Zhongqiu Han wrote:
> On 4/23/2026 4:47 PM, Pierre Gondois wrote:
>> cpufreq_set_policy() will ultimately override the policy min/max
>> values written in the .init() callback through:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-cpufreq_set_policy()
>> \-/* Set policy->min/max */
>> Thus the policy min/max values provided are only temporary.
>>
>> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-__cpufreq_driver_target()
>> \-cpufreq_driver->target()
>> is called. To avoid any regression, set policy->min/max in cpufreq.c
>> if the values were not initialized.
>>
>> In this patch:
>> - Setting policy->min or max value in driver .init() cb is
>> interpreted as setting a QoS constraint.
>> - Remove policy->min/max initialization in drivers if the values
>> are similar to policy->cpuinfo.min_freq/max_freq.
>> The only drivers where these values are different are:
>> - gx-suspmod.c
>> - cppc-cpufreq.c
>> - longrun.c
>> - For the cppc-cpufreq driver, the lowest non-linear freq. is
>> used as a min QoS constraint as suggested at:
>> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>>
>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>
>
> Hi Pierre,
> Thanks for the patch. I have a few additional inline comments/questions
> below.
>
>
>> ---
>> drivers/cpufreq/amd-pstate.c | 16 ++++++++--------
>> drivers/cpufreq/cppc_cpufreq.c | 11 +++++++----
>> drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
>> drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++--
>> drivers/cpufreq/freq_table.c | 7 +++----
>> drivers/cpufreq/gx-suspmod.c | 9 +++++----
>> drivers/cpufreq/intel_pstate.c | 3 ---
>> drivers/cpufreq/pcc-cpufreq.c | 8 ++++----
>> drivers/cpufreq/pxa3xx-cpufreq.c | 4 ++--
>> drivers/cpufreq/sh-cpufreq.c | 4 ++--
>> drivers/cpufreq/virtual-cpufreq.c | 5 +----
>> 11 files changed, 51 insertions(+), 39 deletions(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 453084c67327f..1ed4bcdcc957f 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct
>> cpufreq_policy *policy)
>> perf = READ_ONCE(cpudata->perf);
>> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> - cpudata->nominal_freq,
>> - perf.lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>> + policy->cpuinfo.min_freq = perf_to_freq(perf,
>> + cpudata->nominal_freq,
>> + perf.lowest_perf);
>> + policy->cpuinfo.max_freq = cpudata->max_freq;
>
>
> It is better to update doc as well to avoid new dirver developmenter set
> policy->min / policy->max again?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/cpu-drivers.rst#n102
>
Yes sure, does the following works for you:
(tabs might not be conserved in the mail)
diff --git a/Documentation/cpu-freq/cpu-drivers.rst
b/Documentation/cpu-freq/cpu-drivers.rst
index c5635ac3de547..12dc4dcbdd5a6 100644
--- a/Documentation/cpu-freq/cpu-drivers.rst
+++ b/Documentation/cpu-freq/cpu-drivers.rst
@@ -114,8 +114,14 @@ Then, the driver must fill in the following values:
|policy->cur | The current operating frequency
of |
| | this CPU (if appropriate) |
+-----------------------------------+--------------------------------------+
-|policy->min, | |
-|policy->max, | |
+|policy->min | Minimum frequency QoS constraint. |
+| | Can be overwritten by writing to |
+| | scaling_min sysfs file. |
++-----------------------------------+--------------------------------------+
+|policy->max | Maximum frequency QoS constraint. |
+| | Can be overwritten by writing to |
+| | scaling_max sysfs file. |
++-----------------------------------+--------------------------------------+
|policy->policy and, if necessary, | |
|policy->governor | must contain the "default policy"
for|
| | this CPU. A few moments later,
|
>
>> policy->driver_data = cpudata;
>> ret = amd_pstate_cppc_enable(policy);
>> @@ -1907,10 +1907,10 @@ static int amd_pstate_epp_cpu_init(struct
>> cpufreq_policy *policy)
>> perf = READ_ONCE(cpudata->perf);
>> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> - cpudata->nominal_freq,
>> - perf.lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>> + policy->cpuinfo.min_freq = perf_to_freq(perf,
>> + cpudata->nominal_freq,
>> + perf.lowest_perf);
>> + policy->cpuinfo.max_freq = cpudata->max_freq;
>> policy->driver_data = cpudata;
>> ret = amd_pstate_cppc_enable(policy);
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c
>> b/drivers/cpufreq/cppc_cpufreq.c
>> index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct
>> cpufreq_policy *policy)
>> unsigned int cpu = policy->cpu;
>> struct cppc_cpudata *cpu_data;
>> struct cppc_perf_caps *caps;
>> + unsigned int min, max;
>> int ret;
>> cpu_data = cppc_cpufreq_get_cpu_data(cpu);
>> @@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct
>> cpufreq_policy *policy)
>> caps = &cpu_data->perf_caps;
>> policy->driver_data = cpu_data;
>> + min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> + max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> + caps->highest_perf : caps->nominal_perf);
>> +
>> /*
>> * Set min to lowest nonlinear perf to avoid any efficiency
>> penalty (see
>> * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>> */
>> - policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> - policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> - caps->highest_perf : caps->nominal_perf);
>> + policy->min = min;
>> /*
>> * Set cpuinfo.min_freq to Lowest to make the full range of
>> performance
>> @@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct
>> cpufreq_policy *policy)
>> * nonlinear perf
>> */
>> policy->cpuinfo.min_freq = cppc_perf_to_khz(caps,
>> caps->lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max;
>> + policy->cpuinfo.max_freq = max;
>> policy->transition_delay_us =
>> cppc_cpufreq_get_transition_delay_us(cpu);
>> policy->shared_type = cpu_data->shared_type;
>> diff --git a/drivers/cpufreq/cpufreq-nforce2.c
>> b/drivers/cpufreq/cpufreq-nforce2.c
>> index fbbbe501cf2dc..831102522ad64 100644
>> --- a/drivers/cpufreq/cpufreq-nforce2.c
>> +++ b/drivers/cpufreq/cpufreq-nforce2.c
>> @@ -355,8 +355,8 @@ static int nforce2_cpu_init(struct cpufreq_policy
>> *policy)
>> min_fsb = NFORCE2_MIN_FSB;
>> /* cpuinfo and default policy values */
>> - policy->min = policy->cpuinfo.min_freq = min_fsb * fid * 100;
>> - policy->max = policy->cpuinfo.max_freq = max_fsb * fid * 100;
>> + policy->cpuinfo.min_freq = min_fsb * fid * 100;
>> + policy->cpuinfo.max_freq = max_fsb * fid * 100;
>> return 0;
>> }
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct
>> cpufreq_policy *policy,
>> cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>> if (new_policy) {
>> + unsigned int min, max;
>> +
>> + /* Use policy->min/max set by the driver as QoS requests. */
>> + min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
>> + if (policy->max)
>> + max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
>> + else
>> + max = FREQ_QOS_MAX_DEFAULT_VALUE;
>
>
> Nit: Using local variables named min/max is confusing here since they
> shadow the common min()/max() macros; renaming them (e.g. min_freq
> / max_freq) would improve readability and maintainability.
>
Ok yes sure
>
>> for_each_cpu(j, policy->related_cpus) {
>> per_cpu(cpufreq_cpu_data, j) = policy;
>> add_cpu_dev_symlink(policy, j, get_cpu_device(j));
>> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct
>> cpufreq_policy *policy,
>> ret = freq_qos_add_request(&policy->constraints,
>> &policy->min_freq_req, FREQ_QOS_MIN,
>> - FREQ_QOS_MIN_DEFAULT_VALUE);
>> + min);
>
> It seems that the current patch is not merely a superficial cleanup; it
> also changes the policy->min value in the GX driver, setting it to the
> 5% value expected by the driver. If so, we should document it in the
> commit message.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/
> tree/drivers/cpufreq/gx-suspmod.c#n137
>
> /* gx-suspmod.c constants:
> * POLICY_MIN_DIV = 20
> * max_duration = 255
> * maxfreq = e.g. 300000 kHz (300 MHz)
> */
>
> cpufreq_policy_online()
> cpufreq_driver->init(policy) /* cpufreq_gx_cpu_init() */
> policy->min = maxfreq/20 /* 15000 kHz, 5% */
> cpuinfo.min_freq = maxfreq/255 /* 1176 kHz, 0.39% */
>
> /* Before current patch: 0, not policy->min */
> freq_qos_add_request(..., FREQ_QOS_MIN, 0)
>
> cpufreq_init_policy()
> cpufreq_set_policy()
> /* reads QoS=0, discards init()'s 15000 */
> new_data.min = freq_qos_read_value(FREQ_QOS_MIN)
> cpufreq_gx_verify()
> cpufreq_verify_within_cpu_limits()
> /* 0 < 1176: clamp to hw floor */
> new_data.min = cpuinfo.min_freq /* 1176 kHz */
> WRITE_ONCE(policy->min, 1176) /* 0.39%, not 5% */
>
> After current patch:
> freq_qos_add_request(..., FREQ_QOS_MIN, policy->min)
> => new_data.min stays 15000, no clamping, policy->min = 15000
>
Prior to [1], the policy->min/max values were used as QoS constraint.
This patch effectively set a min QoS constraint, but it should be no
different from what the driver was setting initially.
I will add a reference to the patch + explanation in the commit
message to avoid any ambiguities.
[1] 521223d8b3ec ("cpufreq: Fix initialization of min and max frequency
QoS requests")
>
>> if (ret < 0)
>> goto out_destroy_policy;
>> ret = freq_qos_add_request(&policy->constraints,
>> &policy->max_freq_req, FREQ_QOS_MAX,
>> - FREQ_QOS_MAX_DEFAULT_VALUE);
>> + max);
>> if (ret < 0)
>> goto out_destroy_policy;
>> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>> CPUFREQ_CREATE_POLICY, policy);
>> +
>> + /*
>> + * If the driver didn't set QoS constraints, policy->min/max
>> still
>> + * need to be set as they are used to clamp frequency requests.
>> + */
>> + policy->min = policy->min ? policy->min :
>> policy->cpuinfo.min_freq;
>> + policy->max = policy->max ? policy->max :
>> policy->cpuinfo.max_freq;
>
>
> Does it make sense to set policy->min / policy->max before the
> CPUFREQ_CREATE_POLICY notifier, since some drivers may use them in their
> callbacks?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/core.rst#n58
>
>
>
Yes right indeed, this would be better.
Thanks for the review
^ permalink raw reply related [flat|nested] 8+ messages in thread