From: Pierre Gondois <pierre.gondois@arm.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-kernel@vger.kernel.org, Jie Zhan <zhanjie9@hisilicon.com>,
Lifeng Zheng <zhenglifeng1@huawei.com>,
Ionela Voinescu <ionela.voinescu@arm.com>,
Sumit Gupta <sumitg@nvidia.com>, Huang Rui <ray.huang@amd.com>,
Mario Limonciello <mario.limonciello@amd.com>,
Perry Yuan <perry.yuan@amd.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
"Rafael J. Wysocki" <rafael@kernel.org>,
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
Len Brown <lenb@kernel.org>,
Saravana Kannan <saravanak@kernel.org>,
linux-pm@vger.kernel.org
Subject: Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
Date: Mon, 11 May 2026 15:55:59 +0200 [thread overview]
Message-ID: <066010f3-9973-4076-9ea1-0c675dc84420@arm.com> (raw)
In-Reply-To: <6cq7e26levlhwh6aoq6aii3vsyjqtvp2ksymovkosfhuuunv6u@cyab7jjd2fgy>
Hello Viresh,
On 5/8/26 11:47, Viresh Kumar wrote:
> On 23-04-26, 10:47, Pierre Gondois wrote:
>> cpufreq_set_policy() will ultimately override the policy min/max
>> values written in the .init() callback through:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-cpufreq_set_policy()
>> \-/* Set policy->min/max */
>> Thus the policy min/max values provided are only temporary.
>>
>> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>> \-__cpufreq_driver_target()
>> \-cpufreq_driver->target()
>> is called. To avoid any regression, set policy->min/max in cpufreq.c
>> if the values were not initialized.
>>
>> In this patch:
>> - Setting policy->min or max value in driver .init() cb is
>> interpreted as setting a QoS constraint.
>> - Remove policy->min/max initialization in drivers if the values
>> are similar to policy->cpuinfo.min_freq/max_freq.
>> The only drivers where these values are different are:
>> - gx-suspmod.c
>> - cppc-cpufreq.c
>> - longrun.c
>> - For the cppc-cpufreq driver, the lowest non-linear freq. is
>> used as a min QoS constraint as suggested at:
>> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>>
>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>> ---
>> drivers/cpufreq/amd-pstate.c | 16 ++++++++--------
>> drivers/cpufreq/cppc_cpufreq.c | 11 +++++++----
>> drivers/cpufreq/cpufreq-nforce2.c | 4 ++--
>> drivers/cpufreq/cpufreq.c | 19 +++++++++++++++++--
>> drivers/cpufreq/freq_table.c | 7 +++----
>> drivers/cpufreq/gx-suspmod.c | 9 +++++----
>> drivers/cpufreq/intel_pstate.c | 3 ---
>> drivers/cpufreq/pcc-cpufreq.c | 8 ++++----
>> drivers/cpufreq/pxa3xx-cpufreq.c | 4 ++--
>> drivers/cpufreq/sh-cpufreq.c | 4 ++--
>> drivers/cpufreq/virtual-cpufreq.c | 5 +----
>> 11 files changed, 51 insertions(+), 39 deletions(-)
> I was looking for this series to be divided in 2-3 patches:
> - 1st patch would set policy->min/max from cpufreq_policy_online(), if they
> aren't already set.
> - 2nd patch changes all drivers to not set it anymore, unless QoS thingy.
> - Last change talks about QoS and default values.
Ok
>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 453084c67327f..1ed4bcdcc957f 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>
>> perf = READ_ONCE(cpudata->perf);
>>
>> - policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> - cpudata->nominal_freq,
>> - perf.lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>> + policy->cpuinfo.min_freq = perf_to_freq(perf,
>> + cpudata->nominal_freq,
>> + perf.lowest_perf);
> This can come in two lines now instead of three. Similar issue at few more
> places.
Ok yes
>
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>> unsigned int cpu = policy->cpu;
>> struct cppc_cpudata *cpu_data;
>> struct cppc_perf_caps *caps;
>> + unsigned int min, max;
>> int ret;
>>
>> cpu_data = cppc_cpufreq_get_cpu_data(cpu);
>> @@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>> caps = &cpu_data->perf_caps;
>> policy->driver_data = cpu_data;
>>
>> + min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> + max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> + caps->highest_perf : caps->nominal_perf);
>> +
>> /*
>> * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
>> * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>> */
>> - policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> - policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> - caps->highest_perf : caps->nominal_perf);
>> + policy->min = min;
>>
>> /*
>> * Set cpuinfo.min_freq to Lowest to make the full range of performance
>> @@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>> * nonlinear perf
>> */
>> policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
>> - policy->cpuinfo.max_freq = policy->max;
>> + policy->cpuinfo.max_freq = max;
> I think the use of local variables isn't making it any better.
Ok sure
>
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 7e7f9dfb7a24..700abcb3e2e0 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -660,8 +660,6 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
> */
> policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
> - policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
> - caps->highest_perf : caps->nominal_perf);
>
> /*
> * Set cpuinfo.min_freq to Lowest to make the full range of performance
> @@ -669,7 +667,8 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
> * nonlinear perf
> */
> policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
> - policy->cpuinfo.max_freq = policy->max;
> + policy->cpuinfo.max_freq = cppc_perf_to_khz(caps, policy->boost_enabled ?
> + caps->highest_perf : caps->nominal_perf);
>
> policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us(cpu);
> policy->shared_type = cpu_data->shared_type;
>
>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>> cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>>
>> if (new_policy) {
>> + unsigned int min, max;
>> +
>> + /* Use policy->min/max set by the driver as QoS requests. */
>> + min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
>> + if (policy->max)
>> + max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
>> + else
>> + max = FREQ_QOS_MAX_DEFAULT_VALUE;
>> for_each_cpu(j, policy->related_cpus) {
>> per_cpu(cpufreq_cpu_data, j) = policy;
>> add_cpu_dev_symlink(policy, j, get_cpu_device(j));
>> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>>
>> ret = freq_qos_add_request(&policy->constraints,
>> &policy->min_freq_req, FREQ_QOS_MIN,
>> - FREQ_QOS_MIN_DEFAULT_VALUE);
>> + min);
>> if (ret < 0)
>> goto out_destroy_policy;
>>
>> ret = freq_qos_add_request(&policy->constraints,
>> &policy->max_freq_req, FREQ_QOS_MAX,
>> - FREQ_QOS_MAX_DEFAULT_VALUE);
>> + max);
>> if (ret < 0)
>> goto out_destroy_policy;
>>
>> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>> CPUFREQ_CREATE_POLICY, policy);
>> +
>> + /*
>> + * If the driver didn't set QoS constraints, policy->min/max still
>> + * need to be set as they are used to clamp frequency requests.
>> + */
>> + policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
>> + policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;
> Why aren't we doing this right after cpufreq_driver->init() ?
The above code can be split in 2 functionalities:
A. Adding QoS constraints
B. Fetching the min/max QoS constraint as initialized by the driver.
I think the main constraints are:
cpufreq_table_validate_and_sort()
\-cpufreq_frequency_table_cpuinfo() [1]
\-policy_has_boost_freq() [2]
[1] policy->cpuinfo.max_freq is updated
[2] Boost support is detected
So the QoS logic relying on policy->boost_supported
or policy->cpuinfo.max_freq (i.e. A. and B.) must be added
after cpufreq_table_validate_and_sort().
---
if (!new_policy && cpufreq_driver->online) {
} else {
cpufreq_driver->init()
cpufreq_table_validate_and_sort()
// [3]
}
The else branch is executed if the policy is not new and there
is no .online() callback. To avoid initializing multiple times
the QoS constraints, A. can be done at [3] but only if
new_policy==true.
---
I should have sent a V2 which does everything in a
cpufreq_policy_init_qos(). Please let me know if this
seems reasonable to you.
prev parent reply other threads:[~2026-05-11 13:57 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-23 8:47 [PATCH 0/1] cpufreq: Set policy->min and max as real QoS constraints Pierre Gondois
2026-04-23 8:47 ` [PATCH 1/1] " Pierre Gondois
2026-04-27 3:08 ` Jie Zhan
2026-04-30 13:41 ` Pierre Gondois
2026-04-28 16:37 ` Sumit Gupta
2026-04-30 13:41 ` Pierre Gondois
2026-04-29 13:00 ` Zhongqiu Han
2026-04-30 13:41 ` Pierre Gondois
2026-05-06 12:45 ` Zhongqiu Han
2026-05-08 9:47 ` Viresh Kumar
2026-05-11 13:55 ` Pierre Gondois [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=066010f3-9973-4076-9ea1-0c675dc84420@arm.com \
--to=pierre.gondois@arm.com \
--cc=ionela.voinescu@arm.com \
--cc=kprateek.nayak@amd.com \
--cc=lenb@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=mario.limonciello@amd.com \
--cc=perry.yuan@amd.com \
--cc=rafael@kernel.org \
--cc=ray.huang@amd.com \
--cc=saravanak@kernel.org \
--cc=srinivas.pandruvada@linux.intel.com \
--cc=sumitg@nvidia.com \
--cc=viresh.kumar@linaro.org \
--cc=zhanjie9@hisilicon.com \
--cc=zhenglifeng1@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox