From: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
To: Pierre Gondois <pierre.gondois@arm.com>, linux-kernel@vger.kernel.org
Cc: Jie Zhan <zhanjie9@hisilicon.com>,
Lifeng Zheng <zhenglifeng1@huawei.com>,
Ionela Voinescu <ionela.voinescu@arm.com>,
Sumit Gupta <sumitg@nvidia.com>, Huang Rui <ray.huang@amd.com>,
Mario Limonciello <mario.limonciello@amd.com>,
Perry Yuan <perry.yuan@amd.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Viresh Kumar <viresh.kumar@linaro.org>,
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
Len Brown <lenb@kernel.org>,
Saravana Kannan <saravanak@kernel.org>,
linux-pm@vger.kernel.org, zhongqiu.han@oss.qualcomm.com
Subject: Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
Date: Wed, 6 May 2026 20:45:47 +0800
Message-ID: <e5a98d8b-607f-42d6-8e96-15d8b8f11d25@oss.qualcomm.com>
In-Reply-To: <3efe318a-52bf-4c92-8b86-03e0bb6a9a93@arm.com>
On 4/30/2026 9:41 PM, Pierre Gondois wrote:
> Hello Zhongqiu,
>
> On 4/29/26 15:00, Zhongqiu Han wrote:
>> On 4/23/2026 4:47 PM, Pierre Gondois wrote:
>>> cpufreq_set_policy() will ultimately override the policy min/max
>>> values written in the .init() callback through:
>>> cpufreq_policy_online()
>>> \-cpufreq_init_policy()
>>> \-cpufreq_set_policy()
>>> \-/* Set policy->min/max */
>>> Thus the policy min/max values provided are only temporary.
>>>
>>> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
>>> cpufreq_policy_online()
>>> \-cpufreq_init_policy()
>>> \-__cpufreq_driver_target()
>>> \-cpufreq_driver->target()
>>> is called. To avoid any regression, set policy->min/max in cpufreq.c
>>> if the values were not initialized.
>>>
>>> In this patch:
>>> - Setting policy->min or max value in driver .init() cb is
>>> interpreted as setting a QoS constraint.
>>> - Remove policy->min/max initialization in drivers if the values
>>> are similar to policy->cpuinfo.min_freq/max_freq.
>>> The only drivers where these values are different are:
>>> - gx-suspmod.c
>>> - cppc-cpufreq.c
>>> - longrun.c
>>> - For the cppc-cpufreq driver, the lowest non-linear freq. is
>>> used as a min QoS constraint as suggested at:
>>> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>>>
>>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>>
>>
>> Hi Pierre,
>> Thanks for the patch. I have a few additional inline comments/questions
>> below.
>>
>>
>>> ---
>>> drivers/cpufreq/amd-pstate.c | 16 ++++++++--------
>>> drivers/cpufreq/cppc_cpufreq.c | 11 +++++++----
>>> drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
>>> drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++--
>>> drivers/cpufreq/freq_table.c | 7 +++----
>>> drivers/cpufreq/gx-suspmod.c | 9 +++++----
>>> drivers/cpufreq/intel_pstate.c | 3 ---
>>> drivers/cpufreq/pcc-cpufreq.c | 8 ++++----
>>> drivers/cpufreq/pxa3xx-cpufreq.c | 4 ++--
>>> drivers/cpufreq/sh-cpufreq.c | 4 ++--
>>> drivers/cpufreq/virtual-cpufreq.c | 5 +----
>>> 11 files changed, 51 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>>> index 453084c67327f..1ed4bcdcc957f 100644
>>> --- a/drivers/cpufreq/amd-pstate.c
>>> +++ b/drivers/cpufreq/amd-pstate.c
>>> @@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>> perf = READ_ONCE(cpudata->perf);
>>> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>>> - cpudata->nominal_freq,
>>> - perf.lowest_perf);
>>> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>>> + policy->cpuinfo.min_freq = perf_to_freq(perf,
>>> + cpudata->nominal_freq,
>>> + perf.lowest_perf);
>>> + policy->cpuinfo.max_freq = cpudata->max_freq;
>>
>>
>> It would be better to update the doc as well, so that new driver
>> developers do not set policy->min / policy->max again?
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/cpu-drivers.rst#n102
>
> Yes sure, does the following work for you:
> (tabs might not be conserved in the mail)
>
> diff --git a/Documentation/cpu-freq/cpu-drivers.rst b/Documentation/cpu-freq/cpu-drivers.rst
> index c5635ac3de547..12dc4dcbdd5a6 100644
> --- a/Documentation/cpu-freq/cpu-drivers.rst
> +++ b/Documentation/cpu-freq/cpu-drivers.rst
> @@ -114,8 +114,14 @@ Then, the driver must fill in the following values:
>  |policy->cur                        | The current operating frequency of   |
>  |                                   | this CPU (if appropriate)            |
>  +-----------------------------------+--------------------------------------+
> -|policy->min,                       |                                      |
> -|policy->max,                       |                                      |
> +|policy->min                        | Minimum frequency QoS constraint.    |
> +|                                   | Can be overwritten by writing to     |
> +|                                   | scaling_min sysfs file.              |
> ++-----------------------------------+--------------------------------------+
> +|policy->max                        | Maximum frequency QoS constraint.    |
> +|                                   | Can be overwritten by writing to     |
> +|                                   | scaling_max sysfs file.              |
> ++-----------------------------------+--------------------------------------+
>  |policy->policy and, if necessary,  |                                      |
>  |policy->governor                   | must contain the "default policy" for|
>  |                                   | this CPU. A few moments later,       |
>
Thanks Pierre,
Just one minor nit: I would start the description with 'If' to make it
clear that setting these values is optional.
For example, if you think it makes sense, you could consider:
+|policy->min | If set by the driver in ->init(), used as the |
+| | initial minimum frequency QoS request. |
++-------------------------------------------------------------------+
+|policy->max | If set by the driver in ->init(), used as the |
+| | initial maximum frequency QoS request. |
>>
>>> policy->driver_data = cpudata;
>>> ret = amd_pstate_cppc_enable(policy);
>>> @@ -1907,10 +1907,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>>> perf = READ_ONCE(cpudata->perf);
>>> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>>> - cpudata->nominal_freq,
>>> - perf.lowest_perf);
>>> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>>> + policy->cpuinfo.min_freq = perf_to_freq(perf,
>>> + cpudata->nominal_freq,
>>> + perf.lowest_perf);
>>> + policy->cpuinfo.max_freq = cpudata->max_freq;
>>> policy->driver_data = cpudata;
>>> ret = amd_pstate_cppc_enable(policy);
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>> index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>> unsigned int cpu = policy->cpu;
>>> struct cppc_cpudata *cpu_data;
>>> struct cppc_perf_caps *caps;
>>> + unsigned int min, max;
>>> int ret;
>>> cpu_data = cppc_cpufreq_get_cpu_data(cpu);
>>> @@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>> caps = &cpu_data->perf_caps;
>>> policy->driver_data = cpu_data;
>>> + min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>>> + max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>>> + caps->highest_perf : caps->nominal_perf);
>>> +
>>> /*
>>> * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
>>> * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>>> */
>>> - policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>>> - policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>>> - caps->highest_perf : caps->nominal_perf);
>>> + policy->min = min;
>>> /*
>>> * Set cpuinfo.min_freq to Lowest to make the full range of performance
>>> @@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>> * nonlinear perf
>>> */
>>> policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
>>> - policy->cpuinfo.max_freq = policy->max;
>>> + policy->cpuinfo.max_freq = max;
>>> policy->transition_delay_us =
>>> cppc_cpufreq_get_transition_delay_us(cpu);
>>> policy->shared_type = cpu_data->shared_type;
>>> diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
>>> index fbbbe501cf2dc..831102522ad64 100644
>>> --- a/drivers/cpufreq/cpufreq-nforce2.c
>>> +++ b/drivers/cpufreq/cpufreq-nforce2.c
>>> @@ -355,8 +355,8 @@ static int nforce2_cpu_init(struct cpufreq_policy *policy)
>>> min_fsb = NFORCE2_MIN_FSB;
>>> /* cpuinfo and default policy values */
>>> - policy->min = policy->cpuinfo.min_freq = min_fsb * fid * 100;
>>> - policy->max = policy->cpuinfo.max_freq = max_fsb * fid * 100;
>>> + policy->cpuinfo.min_freq = min_fsb * fid * 100;
>>> + policy->cpuinfo.max_freq = max_fsb * fid * 100;
>>> return 0;
>>> }
>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
>>> --- a/drivers/cpufreq/cpufreq.c
>>> +++ b/drivers/cpufreq/cpufreq.c
>>> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>>> cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>>> if (new_policy) {
>>> + unsigned int min, max;
>>> +
>>> + /* Use policy->min/max set by the driver as QoS requests. */
>>> + min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
>>> + if (policy->max)
>>> + max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
>>> + else
>>> + max = FREQ_QOS_MAX_DEFAULT_VALUE;
>>
>>
>> Nit: Using local variables named min/max is confusing here since they
>> shadow the common min()/max() macros; renaming them (e.g. min_freq
>> / max_freq) would improve readability and maintainability.
>>
> Ok yes sure
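
To spell out how the initial QoS request values follow from what the
driver left in policy->min/max, here is a minimal userspace sketch in C
(not kernel code). QOS_MIN_DEFAULT and QOS_MAX_DEFAULT are assumed
stand-ins for FREQ_QOS_MIN_DEFAULT_VALUE and FREQ_QOS_MAX_DEFAULT_VALUE
(taken here as 0 and INT_MAX), and both helper names are hypothetical:

```c
#include <limits.h>

/* Assumed mirrors of the kernel's FREQ_QOS_*_DEFAULT_VALUE constants. */
#define QOS_MIN_DEFAULT 0u
#define QOS_MAX_DEFAULT ((unsigned int)INT_MAX)

/* Hypothetical helper: initial min request from a driver-set policy->min.
 * An unset (0) policy->min falls back to the default, i.e. no constraint. */
static unsigned int initial_min_freq_req(unsigned int policy_min)
{
	return policy_min > QOS_MIN_DEFAULT ? policy_min : QOS_MIN_DEFAULT;
}

/* Hypothetical helper: initial max request from a driver-set policy->max.
 * 0 means "not set by the driver", so the default max is used instead. */
static unsigned int initial_max_freq_req(unsigned int policy_max)
{
	if (!policy_max)
		return QOS_MAX_DEFAULT;
	return policy_max < QOS_MAX_DEFAULT ? policy_max : QOS_MAX_DEFAULT;
}
```

With these helpers, a driver that leaves policy->max at 0 ends up with
the default (unconstrained) max request, while a driver-set value is
passed through unchanged.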
>>
>>> for_each_cpu(j, policy->related_cpus) {
>>> per_cpu(cpufreq_cpu_data, j) = policy;
>>> add_cpu_dev_symlink(policy, j, get_cpu_device(j));
>>> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>>> ret = freq_qos_add_request(&policy->constraints,
>>> &policy->min_freq_req, FREQ_QOS_MIN,
>>> - FREQ_QOS_MIN_DEFAULT_VALUE);
>>> + min);
>>
>> It seems that the current patch is not merely a superficial cleanup; it
>> also changes the policy->min value in the GX driver, setting it to the
>> 5% value expected by the driver. If so, we should document it in the
>> commit message.
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/drivers/cpufreq/gx-suspmod.c#n137
>>
>> /* gx-suspmod.c constants:
>> * POLICY_MIN_DIV = 20
>> * max_duration = 255
>> * maxfreq = e.g. 300000 kHz (300 MHz)
>> */
>>
>> cpufreq_policy_online()
>> cpufreq_driver->init(policy) /* cpufreq_gx_cpu_init() */
>> policy->min = maxfreq/20 /* 15000 kHz, 5% */
>> cpuinfo.min_freq = maxfreq/255 /* 1176 kHz, 0.39% */
>>
>> /* Before current patch: 0, not policy->min */
>> freq_qos_add_request(..., FREQ_QOS_MIN, 0)
>>
>> cpufreq_init_policy()
>> cpufreq_set_policy()
>> /* reads QoS=0, discards init()'s 15000 */
>> new_data.min = freq_qos_read_value(FREQ_QOS_MIN)
>> cpufreq_gx_verify()
>> cpufreq_verify_within_cpu_limits()
>> /* 0 < 1176: clamp to hw floor */
>> new_data.min = cpuinfo.min_freq /* 1176 kHz */
>> WRITE_ONCE(policy->min, 1176) /* 0.39%, not 5% */
>>
>> After current patch:
>> freq_qos_add_request(..., FREQ_QOS_MIN, policy->min)
>> => new_data.min stays 15000, no clamping, policy->min = 15000
>>
> Prior to [1], the policy->min/max values were used as QoS constraints.
> This patch effectively sets a min QoS constraint, but it should be no
> different from what the driver was setting initially.
>
> I will add a reference to the patch + explanation in the commit
> message to avoid any ambiguities.
>
> [1] 521223d8b3ec ("cpufreq: Fix initialization of min and max frequency
> QoS requests")
>
>
Thanks, that makes sense to me.
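
For the record, the numbers in the gx-suspmod walk-through above can be
reproduced with a small userspace sketch in C. This is not the driver
code: gx_init() and resolve_policy_min() are hypothetical stand-ins for
cpufreq_gx_cpu_init() and for the clamp done via
cpufreq_verify_within_cpu_limits(), and 300000 kHz is just the example
maxfreq used above:

```c
/* Constants quoted above from gx-suspmod.c. */
#define POLICY_MIN_DIV 20
#define MAX_DURATION   255

struct gx_limits {
	unsigned int cpuinfo_min; /* hardware floor, in kHz */
	unsigned int driver_min;  /* policy->min set in ->init(), in kHz */
};

/* Hypothetical stand-in for the limits computed in the driver's ->init(). */
static struct gx_limits gx_init(unsigned int maxfreq)
{
	struct gx_limits l;

	l.driver_min = maxfreq / POLICY_MIN_DIV;
	l.cpuinfo_min = maxfreq / MAX_DURATION;
	return l;
}

/* Hypothetical stand-in for the verify step: clamp the min QoS value
 * into the [cpuinfo_min, cpuinfo_max] hardware range. */
static unsigned int resolve_policy_min(unsigned int qos_min,
				       unsigned int cpuinfo_min,
				       unsigned int cpuinfo_max)
{
	if (qos_min < cpuinfo_min)
		return cpuinfo_min;
	if (qos_min > cpuinfo_max)
		return cpuinfo_max;
	return qos_min;
}
```

Feeding a min QoS value of 0 (the pre-patch request) gives the hardware
floor, whereas feeding the driver's own policy->min keeps the 5% value.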
>>
>>> if (ret < 0)
>>> goto out_destroy_policy;
>>> ret = freq_qos_add_request(&policy->constraints,
>>> &policy->max_freq_req, FREQ_QOS_MAX,
>>> - FREQ_QOS_MAX_DEFAULT_VALUE);
>>> + max);
>>> if (ret < 0)
>>> goto out_destroy_policy;
>>> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>>> CPUFREQ_CREATE_POLICY, policy);
>>> +
>>> + /*
>>> + * If the driver didn't set QoS constraints, policy->min/max still
>>> + * need to be set as they are used to clamp frequency requests.
>>> + */
>>> + policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
>>> + policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;
>>
>>
>> Does it make sense to set policy->min / policy->max before the
>> CPUFREQ_CREATE_POLICY notifier, since some drivers may use them in their
>> callbacks?
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/core.rst#n58
>>
>>
> Yes right indeed, this would be better.
>
> Thanks for the review
>
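
As a rough illustration of the ordering point above, here is a userspace
sketch in C in which the fallback to cpuinfo limits runs before the
"policy created" callback, so the callback never sees zeroed limits.
struct policy_sketch, fill_default_limits(), on_create_policy() and
online_sketch() are all hypothetical names, and the real notifier chain
is not modeled:

```c
/* Hypothetical, reduced view of the fields discussed in this thread. */
struct policy_sketch {
	unsigned int min, max;                 /* 0 means "not set by driver" */
	unsigned int cpuinfo_min, cpuinfo_max; /* hardware limits */
};

/* What a CPUFREQ_CREATE_POLICY-style callback would observe. */
static unsigned int seen_min, seen_max;

/* Fall back to the hardware limits when the driver set nothing. */
static void fill_default_limits(struct policy_sketch *p)
{
	if (!p->min)
		p->min = p->cpuinfo_min;
	if (!p->max)
		p->max = p->cpuinfo_max;
}

/* Hypothetical callback that reads policy->min/max. */
static void on_create_policy(const struct policy_sketch *p)
{
	seen_min = p->min;
	seen_max = p->max;
}

/* Fill in the defaults first, then notify. */
static void online_sketch(struct policy_sketch *p)
{
	fill_default_limits(p);
	on_create_policy(p);
}
```

If the two calls in online_sketch() were swapped, a callback reading
the limits would see 0 for any value the driver did not set.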
--
Thx and BRs,
Zhongqiu Han