Linux Power Management development
 help / color / mirror / Atom feed
From: Pierre Gondois <pierre.gondois@arm.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-kernel@vger.kernel.org, Jie Zhan <zhanjie9@hisilicon.com>,
	Lifeng Zheng <zhenglifeng1@huawei.com>,
	Ionela Voinescu <ionela.voinescu@arm.com>,
	Sumit Gupta <sumitg@nvidia.com>, Huang Rui <ray.huang@amd.com>,
	Mario Limonciello <mario.limonciello@amd.com>,
	Perry Yuan <perry.yuan@amd.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Len Brown <lenb@kernel.org>,
	Saravana Kannan <saravanak@kernel.org>,
	linux-pm@vger.kernel.org
Subject: Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
Date: Mon, 11 May 2026 15:55:59 +0200	[thread overview]
Message-ID: <066010f3-9973-4076-9ea1-0c675dc84420@arm.com> (raw)
In-Reply-To: <6cq7e26levlhwh6aoq6aii3vsyjqtvp2ksymovkosfhuuunv6u@cyab7jjd2fgy>

Hello Viresh,

On 5/8/26 11:47, Viresh Kumar wrote:
> On 23-04-26, 10:47, Pierre Gondois wrote:
>> cpufreq_set_policy() will ultimately override the policy min/max
>> values written in the .init() callback through:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>>    \-cpufreq_set_policy()
>>      \-/* Set policy->min/max */
>> Thus the policy min/max values provided are only temporary.
>>
>> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>>    \-__cpufreq_driver_target()
>>      \-cpufreq_driver->target()
>> is called. To avoid any regression, set policy->min/max in cpufreq.c
>> if the values were not initialized.
>>
>> In this patch:
>> - Setting policy->min or max value in driver .init() cb is
>>    interpreted as setting a QoS constraint.
>> - Remove policy->min/max initialization in drivers if the values
>>    are similar to policy->cpuinfo.min_freq/max_freq.
>>    The only drivers where these values are different are:
>>    - gx-suspmod.c
>>    - cppc-cpufreq.c
>>    - longrun.c
>> - For the cppc-cpufreq driver, the lowest non-linear freq. is
>>    used as a min QoS constraint as suggested at:
>>    https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>>
>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>> ---
>>   drivers/cpufreq/amd-pstate.c      | 16 ++++++++--------
>>   drivers/cpufreq/cppc_cpufreq.c    | 11 +++++++----
>>   drivers/cpufreq/cpufreq-nforce2.c |  4 ++--
>>   drivers/cpufreq/cpufreq.c         | 19 +++++++++++++++++--
>>   drivers/cpufreq/freq_table.c      |  7 +++----
>>   drivers/cpufreq/gx-suspmod.c      |  9 +++++----
>>   drivers/cpufreq/intel_pstate.c    |  3 ---
>>   drivers/cpufreq/pcc-cpufreq.c     |  8 ++++----
>>   drivers/cpufreq/pxa3xx-cpufreq.c  |  4 ++--
>>   drivers/cpufreq/sh-cpufreq.c      |  4 ++--
>>   drivers/cpufreq/virtual-cpufreq.c |  5 +----
>>   11 files changed, 51 insertions(+), 39 deletions(-)
> I was looking for this series to be divided in 2-3 patches:
> - 1st patch would set policy->min/max from cpufreq_policy_online(), if they
>    aren't already set.
> - 2nd patch changes all drivers to not set it anymore, unless QoS thingy.
> - Last change talks about QoS and default values.
Ok
>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 453084c67327f..1ed4bcdcc957f 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>   
>>   	perf = READ_ONCE(cpudata->perf);
>>   
>> -	policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> -							      cpudata->nominal_freq,
>> -							      perf.lowest_perf);
>> -	policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>> +	policy->cpuinfo.min_freq = perf_to_freq(perf,
>> +						cpudata->nominal_freq,
>> +						perf.lowest_perf);
> This can come in two lines now instead of three. Similar issue at few more
> places.
Ok yes
>
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>   	unsigned int cpu = policy->cpu;
>>   	struct cppc_cpudata *cpu_data;
>>   	struct cppc_perf_caps *caps;
>> +	unsigned int min, max;
>>   	int ret;
>>   
>>   	cpu_data = cppc_cpufreq_get_cpu_data(cpu);
>> @@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>   	caps = &cpu_data->perf_caps;
>>   	policy->driver_data = cpu_data;
>>   
>> +	min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> +	max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> +			       caps->highest_perf : caps->nominal_perf);
>> +
>>   	/*
>>   	 * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
>>   	 * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>>   	 */
>> -	policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> -	policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> -						caps->highest_perf : caps->nominal_perf);
>> +	policy->min = min;
>>   
>>   	/*
>>   	 * Set cpuinfo.min_freq to Lowest to make the full range of performance
>> @@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>   	 * nonlinear perf
>>   	 */
>>   	policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
>> -	policy->cpuinfo.max_freq = policy->max;
>> +	policy->cpuinfo.max_freq = max;
> I think the use of local variables isn't making it any better.
Ok sure
>
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 7e7f9dfb7a24..700abcb3e2e0 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -660,8 +660,6 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>           * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>           */
>          policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
> -       policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
> -                                               caps->highest_perf : caps->nominal_perf);
>
>          /*
>           * Set cpuinfo.min_freq to Lowest to make the full range of performance
> @@ -669,7 +667,8 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>           * nonlinear perf
>           */
>          policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
> -       policy->cpuinfo.max_freq = policy->max;
> +       policy->cpuinfo.max_freq = cppc_perf_to_khz(caps, policy->boost_enabled ?
> +                                  caps->highest_perf : caps->nominal_perf);
>
>          policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us(cpu);
>          policy->shared_type = cpu_data->shared_type;
>
>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>>   	cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>>   
>>   	if (new_policy) {
>> +		unsigned int min, max;
>> +
>> +		/* Use policy->min/max set by the driver as QoS requests. */
>> +		min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
>> +		if (policy->max)
>> +			max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
>> +		else
>> +			max = FREQ_QOS_MAX_DEFAULT_VALUE;
>>   		for_each_cpu(j, policy->related_cpus) {
>>   			per_cpu(cpufreq_cpu_data, j) = policy;
>>   			add_cpu_dev_symlink(policy, j, get_cpu_device(j));
>> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>>   
>>   		ret = freq_qos_add_request(&policy->constraints,
>>   					   &policy->min_freq_req, FREQ_QOS_MIN,
>> -					   FREQ_QOS_MIN_DEFAULT_VALUE);
>> +					   min);
>>   		if (ret < 0)
>>   			goto out_destroy_policy;
>>   
>>   		ret = freq_qos_add_request(&policy->constraints,
>>   					   &policy->max_freq_req, FREQ_QOS_MAX,
>> -					   FREQ_QOS_MAX_DEFAULT_VALUE);
>> +					   max);
>>   		if (ret < 0)
>>   			goto out_destroy_policy;
>>   
>>   		blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>>   				CPUFREQ_CREATE_POLICY, policy);
>> +
>> +		/*
>> +		 * If the driver didn't set QoS constraints, policy->min/max still
>> +		 * need to be set as they are used to clamp frequency requests.
>> +		 */
>> +		policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
>> +		policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;
> Why aren't we doing this right after cpufreq_driver->init() ?
The above code can be split in 2 functionalities:
A. Adding QoS constraints
B. Fetching the min/max QoS constraint as initialized by the driver.

I think the main constraints are:
cpufreq_table_validate_and_sort()
\-cpufreq_frequency_table_cpuinfo() [1]
\-policy_has_boost_freq() [2]

[1] policy->cpuinfo.max_freq is updated
[2] Boost support is detected

So the QoS logic relying on policy->boost_supported
or policy->cpuinfo.max_freq (i.e. A. and B.) must be added
after cpufreq_table_validate_and_sort().

---

if (!new_policy && cpufreq_driver->online) {
} else {
   cpufreq_driver->init()
   cpufreq_table_validate_and_sort()
   // [3]
}
The else branch is executed if the policy is not new and there
is no .online() callback. To avoid initializing multiple times
the QoS constraints, A. can be done at [3] but only if
new_policy==true.

---

I should have sent a V2 which does everything in a
cpufreq_policy_init_qos(). Please let me know if this
seems reasonable to you.

      reply	other threads:[~2026-05-11 13:57 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-23  8:47 [PATCH 0/1] cpufreq: Set policy->min and max as real QoS constraints Pierre Gondois
2026-04-23  8:47 ` [PATCH 1/1] " Pierre Gondois
2026-04-27  3:08   ` Jie Zhan
2026-04-30 13:41     ` Pierre Gondois
2026-04-28 16:37   ` Sumit Gupta
2026-04-30 13:41     ` Pierre Gondois
2026-04-29 13:00   ` Zhongqiu Han
2026-04-30 13:41     ` Pierre Gondois
2026-05-06 12:45       ` Zhongqiu Han
2026-05-08  9:47   ` Viresh Kumar
2026-05-11 13:55     ` Pierre Gondois [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=066010f3-9973-4076-9ea1-0c675dc84420@arm.com \
    --to=pierre.gondois@arm.com \
    --cc=ionela.voinescu@arm.com \
    --cc=kprateek.nayak@amd.com \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mario.limonciello@amd.com \
    --cc=perry.yuan@amd.com \
    --cc=rafael@kernel.org \
    --cc=ray.huang@amd.com \
    --cc=saravanak@kernel.org \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=sumitg@nvidia.com \
    --cc=viresh.kumar@linaro.org \
    --cc=zhanjie9@hisilicon.com \
    --cc=zhenglifeng1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox