From: Pierre Gondois <pierre.gondois@arm.com>
To: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>,
linux-kernel@vger.kernel.org
Cc: Jie Zhan <zhanjie9@hisilicon.com>,
Lifeng Zheng <zhenglifeng1@huawei.com>,
Ionela Voinescu <ionela.voinescu@arm.com>,
Sumit Gupta <sumitg@nvidia.com>, Huang Rui <ray.huang@amd.com>,
Mario Limonciello <mario.limonciello@amd.com>,
Perry Yuan <perry.yuan@amd.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Viresh Kumar <viresh.kumar@linaro.org>,
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
Len Brown <lenb@kernel.org>,
Saravana Kannan <saravanak@kernel.org>,
linux-pm@vger.kernel.org
Subject: Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
Date: Thu, 30 Apr 2026 15:41:10 +0200
Message-ID: <3efe318a-52bf-4c92-8b86-03e0bb6a9a93@arm.com>
In-Reply-To: <73fac9ca-451d-49f0-b9c7-5ef6bc0119bf@oss.qualcomm.com>
Hello Zhongqiu,
On 4/29/26 15:00, Zhongqiu Han wrote:
> On 4/23/2026 4:47 PM, Pierre Gondois wrote:
>> cpufreq_set_policy() will ultimately override the policy min/max
>> values written in the .init() callback through:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-cpufreq_set_policy()
>> \-/* Set policy->min/max */
>> Thus the policy min/max values provided are only temporary.
>>
>> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-__cpufreq_driver_target()
>> \-cpufreq_driver->target()
>> is called. To avoid any regression, set policy->min/max in cpufreq.c
>> if the values were not initialized.
>>
>> In this patch:
>> - Setting policy->min or max value in driver .init() cb is
>> interpreted as setting a QoS constraint.
>> - Remove policy->min/max initialization in drivers if the values
>> are similar to policy->cpuinfo.min_freq/max_freq.
>> The only drivers where these values are different are:
>> - gx-suspmod.c
>> - cppc-cpufreq.c
>> - longrun.c
>> - For the cppc-cpufreq driver, the lowest non-linear freq. is
>> used as a min QoS constraint as suggested at:
>> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>>
>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>
>
> Hi Pierre,
> Thanks for the patch. I have a few additional inline comments/questions
> below.
>
>
>> ---
>> drivers/cpufreq/amd-pstate.c | 16 ++++++++--------
>> drivers/cpufreq/cppc_cpufreq.c | 11 +++++++----
>> drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
>> drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++--
>> drivers/cpufreq/freq_table.c | 7 +++----
>> drivers/cpufreq/gx-suspmod.c | 9 +++++----
>> drivers/cpufreq/intel_pstate.c | 3 ---
>> drivers/cpufreq/pcc-cpufreq.c | 8 ++++----
>> drivers/cpufreq/pxa3xx-cpufreq.c | 4 ++--
>> drivers/cpufreq/sh-cpufreq.c | 4 ++--
>> drivers/cpufreq/virtual-cpufreq.c | 5 +----
>> 11 files changed, 51 insertions(+), 39 deletions(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 453084c67327f..1ed4bcdcc957f 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct
>> cpufreq_policy *policy)
>> perf = READ_ONCE(cpudata->perf);
>> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> - cpudata->nominal_freq,
>> - perf.lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>> + policy->cpuinfo.min_freq = perf_to_freq(perf,
>> + cpudata->nominal_freq,
>> + perf.lowest_perf);
>> + policy->cpuinfo.max_freq = cpudata->max_freq;
>
>
> Would it be better to update the documentation as well, so that new
> driver developers do not set policy->min / policy->max again?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/cpu-drivers.rst#n102
>
Yes, sure. Does the following work for you?
(Tabs might not be preserved in the mail.)
diff --git a/Documentation/cpu-freq/cpu-drivers.rst b/Documentation/cpu-freq/cpu-drivers.rst
index c5635ac3de547..12dc4dcbdd5a6 100644
--- a/Documentation/cpu-freq/cpu-drivers.rst
+++ b/Documentation/cpu-freq/cpu-drivers.rst
@@ -114,8 +114,14 @@ Then, the driver must fill in the following values:
 |policy->cur                        | The current operating frequency of   |
 |                                   | this CPU (if appropriate)            |
 +-----------------------------------+--------------------------------------+
-|policy->min,                       |                                      |
-|policy->max,                       |                                      |
+|policy->min                        | Minimum frequency QoS constraint.    |
+|                                   | Can be overwritten by writing to     |
+|                                   | scaling_min_freq sysfs file.         |
++-----------------------------------+--------------------------------------+
+|policy->max                        | Maximum frequency QoS constraint.    |
+|                                   | Can be overwritten by writing to     |
+|                                   | scaling_max_freq sysfs file.         |
++-----------------------------------+--------------------------------------+
 |policy->policy and, if necessary,  |                                      |
 |policy->governor                   | must contain the "default policy" for|
 |                                   | this CPU. A few moments later,       |
>
>> policy->driver_data = cpudata;
>> ret = amd_pstate_cppc_enable(policy);
>> @@ -1907,10 +1907,10 @@ static int amd_pstate_epp_cpu_init(struct
>> cpufreq_policy *policy)
>> perf = READ_ONCE(cpudata->perf);
>> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> - cpudata->nominal_freq,
>> - perf.lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>> + policy->cpuinfo.min_freq = perf_to_freq(perf,
>> + cpudata->nominal_freq,
>> + perf.lowest_perf);
>> + policy->cpuinfo.max_freq = cpudata->max_freq;
>> policy->driver_data = cpudata;
>> ret = amd_pstate_cppc_enable(policy);
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c
>> b/drivers/cpufreq/cppc_cpufreq.c
>> index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct
>> cpufreq_policy *policy)
>> unsigned int cpu = policy->cpu;
>> struct cppc_cpudata *cpu_data;
>> struct cppc_perf_caps *caps;
>> + unsigned int min, max;
>> int ret;
>> cpu_data = cppc_cpufreq_get_cpu_data(cpu);
>> @@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct
>> cpufreq_policy *policy)
>> caps = &cpu_data->perf_caps;
>> policy->driver_data = cpu_data;
>> + min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> + max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> + caps->highest_perf : caps->nominal_perf);
>> +
>> /*
>> * Set min to lowest nonlinear perf to avoid any efficiency
>> penalty (see
>> * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>> */
>> - policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> - policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> - caps->highest_perf : caps->nominal_perf);
>> + policy->min = min;
>> /*
>> * Set cpuinfo.min_freq to Lowest to make the full range of
>> performance
>> @@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct
>> cpufreq_policy *policy)
>> * nonlinear perf
>> */
>> policy->cpuinfo.min_freq = cppc_perf_to_khz(caps,
>> caps->lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max;
>> + policy->cpuinfo.max_freq = max;
>> policy->transition_delay_us =
>> cppc_cpufreq_get_transition_delay_us(cpu);
>> policy->shared_type = cpu_data->shared_type;
>> diff --git a/drivers/cpufreq/cpufreq-nforce2.c
>> b/drivers/cpufreq/cpufreq-nforce2.c
>> index fbbbe501cf2dc..831102522ad64 100644
>> --- a/drivers/cpufreq/cpufreq-nforce2.c
>> +++ b/drivers/cpufreq/cpufreq-nforce2.c
>> @@ -355,8 +355,8 @@ static int nforce2_cpu_init(struct cpufreq_policy
>> *policy)
>> min_fsb = NFORCE2_MIN_FSB;
>> /* cpuinfo and default policy values */
>> - policy->min = policy->cpuinfo.min_freq = min_fsb * fid * 100;
>> - policy->max = policy->cpuinfo.max_freq = max_fsb * fid * 100;
>> + policy->cpuinfo.min_freq = min_fsb * fid * 100;
>> + policy->cpuinfo.max_freq = max_fsb * fid * 100;
>> return 0;
>> }
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct
>> cpufreq_policy *policy,
>> cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>> if (new_policy) {
>> + unsigned int min, max;
>> +
>> + /* Use policy->min/max set by the driver as QoS requests. */
>> + min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
>> + if (policy->max)
>> + max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
>> + else
>> + max = FREQ_QOS_MAX_DEFAULT_VALUE;
>
>
> Nit: Using local variables named min/max is confusing here since they
> shadow the common min()/max() macros; renaming them (e.g. min_freq
> / max_freq) would improve readability and maintainability.
>
OK, yes, sure.
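
A minimal userspace sketch of how the renamed locals could read, not the
final kernel code. The struct and the two helper names are hypothetical and
only stand in for the relevant struct cpufreq_policy fields; the FREQ_QOS_*
values are assumed to mirror include/linux/pm_qos.h (0 and S32_MAX).

```c
#include <limits.h>

/* Assumed to mirror include/linux/pm_qos.h (S32_MAX in the kernel). */
#define FREQ_QOS_MIN_DEFAULT_VALUE 0
#define FREQ_QOS_MAX_DEFAULT_VALUE INT_MAX

/* Hypothetical stand-in for the two struct cpufreq_policy fields used. */
struct policy_limits {
	unsigned int min;	/* kHz, 0 if the driver left it unset */
	unsigned int max;	/* kHz, 0 if the driver left it unset */
};

/* Initial min QoS request: the driver value, or the default if unset. */
static int initial_min_freq_req(const struct policy_limits *p)
{
	int min_freq = p->min > (unsigned int)INT_MAX ? INT_MAX : (int)p->min;

	return min_freq > FREQ_QOS_MIN_DEFAULT_VALUE ?
	       min_freq : FREQ_QOS_MIN_DEFAULT_VALUE;
}

/* Initial max QoS request: the driver value, or the default if unset. */
static int initial_max_freq_req(const struct policy_limits *p)
{
	int max_freq;

	if (!p->max)
		return FREQ_QOS_MAX_DEFAULT_VALUE;

	max_freq = p->max > (unsigned int)INT_MAX ? INT_MAX : (int)p->max;
	return max_freq < FREQ_QOS_MAX_DEFAULT_VALUE ?
	       max_freq : FREQ_QOS_MAX_DEFAULT_VALUE;
}
```

With both fields left at 0 the helpers return the default QoS values; with
min = 15000 and max = 300000 they return those values unchanged.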
>
>> for_each_cpu(j, policy->related_cpus) {
>> per_cpu(cpufreq_cpu_data, j) = policy;
>> add_cpu_dev_symlink(policy, j, get_cpu_device(j));
>> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct
>> cpufreq_policy *policy,
>> ret = freq_qos_add_request(&policy->constraints,
>> &policy->min_freq_req, FREQ_QOS_MIN,
>> - FREQ_QOS_MIN_DEFAULT_VALUE);
>> + min);
>
> It seems that the current patch is not merely a superficial cleanup; it
> also changes the policy->min value in the GX driver, setting it to the
> 5% value expected by the driver. If so, we should document it in the
> commit message.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/drivers/cpufreq/gx-suspmod.c#n137
>
> /* gx-suspmod.c constants:
> * POLICY_MIN_DIV = 20
> * max_duration = 255
> * maxfreq = e.g. 300000 kHz (300 MHz)
> */
>
> cpufreq_policy_online()
> cpufreq_driver->init(policy) /* cpufreq_gx_cpu_init() */
> policy->min = maxfreq/20 /* 15000 kHz, 5% */
> cpuinfo.min_freq = maxfreq/255 /* 1176 kHz, 0.39% */
>
> /* Before current patch: 0, not policy->min */
> freq_qos_add_request(..., FREQ_QOS_MIN, 0)
>
> cpufreq_init_policy()
> cpufreq_set_policy()
> /* reads QoS=0, discards init()'s 15000 */
> new_data.min = freq_qos_read_value(FREQ_QOS_MIN)
> cpufreq_gx_verify()
> cpufreq_verify_within_cpu_limits()
> /* 0 < 1176: clamp to hw floor */
> new_data.min = cpuinfo.min_freq /* 1176 kHz */
> WRITE_ONCE(policy->min, 1176) /* 0.39%, not 5% */
>
> After current patch:
> freq_qos_add_request(..., FREQ_QOS_MIN, policy->min)
> => new_data.min stays 15000, no clamping, policy->min = 15000
>
Prior to [1], the policy->min/max values were used as QoS constraints.
This patch effectively sets a min QoS constraint, but it should be no
different from what the driver was setting initially.
I will add a reference to that commit and an explanation to the commit
message to avoid any ambiguity.
[1] 521223d8b3ec ("cpufreq: Fix initialization of min and max frequency
QoS requests")
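
To make the GX numbers quoted above concrete, here is a small userspace
sketch of the arithmetic. It is only a model: gx_resolved_min() and
clamp_khz() are hypothetical helpers, the constants are the ones quoted from
gx-suspmod.c, and it assumes cpuinfo.max_freq equals maxfreq and that
verification simply clamps the QoS value to the cpuinfo limits.

```c
/* Constants as quoted above for gx-suspmod.c. */
#define POLICY_MIN_DIV	20
#define MAX_DURATION	255

/* Clamp v to the inclusive range [lo, hi]. */
static unsigned int clamp_khz(unsigned int v, unsigned int lo, unsigned int hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Hypothetical model of the policy->min obtained after
 * cpufreq_set_policy() for a given initial min QoS request.
 */
static unsigned int gx_resolved_min(unsigned int maxfreq, unsigned int qos_min)
{
	unsigned int cpuinfo_min = maxfreq / MAX_DURATION;

	return clamp_khz(qos_min, cpuinfo_min, maxfreq);
}
```

For maxfreq = 300000 kHz, a min QoS request of 0 resolves to 1176 kHz (the
hardware floor), while a request of maxfreq / POLICY_MIN_DIV resolves to
15000 kHz.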
>
>> if (ret < 0)
>> goto out_destroy_policy;
>> ret = freq_qos_add_request(&policy->constraints,
>> &policy->max_freq_req, FREQ_QOS_MAX,
>> - FREQ_QOS_MAX_DEFAULT_VALUE);
>> + max);
>> if (ret < 0)
>> goto out_destroy_policy;
>> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>> CPUFREQ_CREATE_POLICY, policy);
>> +
>> + /*
>> + * If the driver didn't set QoS constraints, policy->min/max
>> still
>> + * need to be set as they are used to clamp frequency requests.
>> + */
>> + policy->min = policy->min ? policy->min :
>> policy->cpuinfo.min_freq;
>> + policy->max = policy->max ? policy->max :
>> policy->cpuinfo.max_freq;
>
>
> Does it make sense to set policy->min / policy->max before the
> CPUFREQ_CREATE_POLICY notifier, since some drivers may use them in their
> callbacks?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/core.rst#n58
>
>
>
Yes, right indeed, this would be better.
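
A rough userspace sketch of the ordering being discussed, assuming the
fallback assignment simply moves ahead of the notifier call. The struct, the
callback type and both functions are hypothetical; the callback stands in
for a CPUFREQ_CREATE_POLICY notifier.

```c
/* Hypothetical stand-in for the struct cpufreq_policy fields involved. */
struct policy_model {
	unsigned int min, max;			/* 0 if unset by the driver */
	unsigned int cpuinfo_min, cpuinfo_max;	/* hardware limits, kHz */
};

/* Hypothetical stand-in for a CPUFREQ_CREATE_POLICY notifier callback. */
typedef void (*create_policy_cb)(const struct policy_model *p, void *data);

static void policy_online_tail(struct policy_model *p,
			       create_policy_cb cb, void *data)
{
	/* Fall back to the hardware limits if the driver set nothing. */
	if (!p->min)
		p->min = p->cpuinfo_min;
	if (!p->max)
		p->max = p->cpuinfo_max;

	/* Only then notify, so that callbacks see valid min/max values. */
	if (cb)
		cb(p, data);
}

/* Example callback: records the limits visible at notification time. */
static void record_limits(const struct policy_model *p, void *data)
{
	unsigned int *seen = data;

	seen[0] = p->min;
	seen[1] = p->max;
}
```

With this order, a callback invoked for a policy whose driver left min/max
at 0 observes the cpuinfo limits rather than 0.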
Thanks for the review