public inbox for linux-pm@vger.kernel.org
From: Pierre Gondois <pierre.gondois@arm.com>
To: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>,
	linux-kernel@vger.kernel.org
Cc: Jie Zhan <zhanjie9@hisilicon.com>,
	Lifeng Zheng <zhenglifeng1@huawei.com>,
	Ionela Voinescu <ionela.voinescu@arm.com>,
	Sumit Gupta <sumitg@nvidia.com>, Huang Rui <ray.huang@amd.com>,
	Mario Limonciello <mario.limonciello@amd.com>,
	Perry Yuan <perry.yuan@amd.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Len Brown <lenb@kernel.org>,
	Saravana Kannan <saravanak@kernel.org>,
	linux-pm@vger.kernel.org
Subject: Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
Date: Thu, 30 Apr 2026 15:41:10 +0200
Message-ID: <3efe318a-52bf-4c92-8b86-03e0bb6a9a93@arm.com>
In-Reply-To: <73fac9ca-451d-49f0-b9c7-5ef6bc0119bf@oss.qualcomm.com>

Hello Zhongqiu,

On 4/29/26 15:00, Zhongqiu Han wrote:
> On 4/23/2026 4:47 PM, Pierre Gondois wrote:
>> cpufreq_set_policy() will ultimately override the policy min/max
>> values written in the .init() callback through:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>>    \-cpufreq_set_policy()
>>      \-/* Set policy->min/max */
>> Thus the policy min/max values provided are only temporary.
>>
>> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
>> cpufreq_policy_online()
>> \-cpufreq_init_policy()
>>    \-__cpufreq_driver_target()
>>      \-cpufreq_driver->target()
>> is called. To avoid any regression, set policy->min/max in cpufreq.c
>> if the values were not initialized.
>>
>> In this patch:
>> - Setting policy->min or max value in driver .init() cb is
>>    interpreted as setting a QoS constraint.
>> - Remove policy->min/max initialization in drivers if the values
>>    are similar to policy->cpuinfo.min_freq/max_freq.
>>    The only drivers where these values are different are:
>>    - gx-suspmod.c
>>    - cppc-cpufreq.c
>>    - longrun.c
>> - For the cppc-cpufreq driver, the lowest non-linear freq. is
>>    used as a min QoS constraint as suggested at:
>> https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
>>
>> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
>
>
> Hi Pierre,
> Thanks for the patch. I have a few additional inline comments/questions
> below.
>
>
>> ---
>>   drivers/cpufreq/amd-pstate.c      | 16 ++++++++--------
>>   drivers/cpufreq/cppc_cpufreq.c    | 11 +++++++----
>>   drivers/cpufreq/cpufreq-nforce2.c |  4 ++--
>>   drivers/cpufreq/cpufreq.c         | 19 +++++++++++++++++--
>>   drivers/cpufreq/freq_table.c      |  7 +++----
>>   drivers/cpufreq/gx-suspmod.c      |  9 +++++----
>>   drivers/cpufreq/intel_pstate.c    |  3 ---
>>   drivers/cpufreq/pcc-cpufreq.c     |  8 ++++----
>>   drivers/cpufreq/pxa3xx-cpufreq.c  |  4 ++--
>>   drivers/cpufreq/sh-cpufreq.c      |  4 ++--
>>   drivers/cpufreq/virtual-cpufreq.c |  5 +----
>>   11 files changed, 51 insertions(+), 39 deletions(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 453084c67327f..1ed4bcdcc957f 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>         perf = READ_ONCE(cpudata->perf);
>>   -    policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> -                                  cpudata->nominal_freq,
>> -                                  perf.lowest_perf);
>> -    policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>> +    policy->cpuinfo.min_freq = perf_to_freq(perf,
>> +                        cpudata->nominal_freq,
>> +                        perf.lowest_perf);
>> +    policy->cpuinfo.max_freq = cpudata->max_freq;
>
>
> Would it be better to update the documentation as well, so that new
> driver developers do not set policy->min / policy->max again?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/cpu-drivers.rst#n102 
>

Yes, sure. Does the following work for you?
(tabs might not be preserved in the mail)

diff --git a/Documentation/cpu-freq/cpu-drivers.rst b/Documentation/cpu-freq/cpu-drivers.rst
index c5635ac3de547..12dc4dcbdd5a6 100644
--- a/Documentation/cpu-freq/cpu-drivers.rst
+++ b/Documentation/cpu-freq/cpu-drivers.rst
@@ -114,8 +114,14 @@ Then, the driver must fill in the following values:
 |policy->cur                        | The current operating frequency of   |
 |                                   | this CPU (if appropriate)            |
 +-----------------------------------+--------------------------------------+
-|policy->min,                       |                                      |
-|policy->max,                       |                                      |
+|policy->min                        | Minimum frequency QoS constraint.    |
+|                                   | Can be overwritten by writing to     |
+|                                   | scaling_min sysfs file.              |
++-----------------------------------+--------------------------------------+
+|policy->max                        | Maximum frequency QoS constraint.    |
+|                                   | Can be overwritten by writing to     |
+|                                   | scaling_max sysfs file.              |
++-----------------------------------+--------------------------------------+
 |policy->policy and, if necessary,  |                                      |
 |policy->governor                   | must contain the "default policy" for|
 |                                   | this CPU. A few moments later,       |

>
>>         policy->driver_data = cpudata;
>>       ret = amd_pstate_cppc_enable(policy);
>> @@ -1907,10 +1907,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>>         perf = READ_ONCE(cpudata->perf);
>>   -    policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
>> -                                  cpudata->nominal_freq,
>> -                                  perf.lowest_perf);
>> -    policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
>> +    policy->cpuinfo.min_freq = perf_to_freq(perf,
>> +                        cpudata->nominal_freq,
>> +                        perf.lowest_perf);
>> +    policy->cpuinfo.max_freq = cpudata->max_freq;
>>       policy->driver_data = cpudata;
>>         ret = amd_pstate_cppc_enable(policy);
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>       unsigned int cpu = policy->cpu;
>>       struct cppc_cpudata *cpu_data;
>>       struct cppc_perf_caps *caps;
>> +    unsigned int min, max;
>>       int ret;
>>         cpu_data = cppc_cpufreq_get_cpu_data(cpu);
>> @@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>       caps = &cpu_data->perf_caps;
>>       policy->driver_data = cpu_data;
>>   +    min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> +    max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> +                   caps->highest_perf : caps->nominal_perf);
>> +
>>       /*
>>        * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
>>        * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>>        */
>> -    policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
>> -    policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
>> -                        caps->highest_perf : caps->nominal_perf);
>> +    policy->min = min;
>>         /*
>>        * Set cpuinfo.min_freq to Lowest to make the full range of performance
>> @@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>        * nonlinear perf
>>        */
>>       policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
>> -    policy->cpuinfo.max_freq = policy->max;
>> +    policy->cpuinfo.max_freq = max;
>>         policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us(cpu);
>>       policy->shared_type = cpu_data->shared_type;
>> diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
>> index fbbbe501cf2dc..831102522ad64 100644
>> --- a/drivers/cpufreq/cpufreq-nforce2.c
>> +++ b/drivers/cpufreq/cpufreq-nforce2.c
>> @@ -355,8 +355,8 @@ static int nforce2_cpu_init(struct cpufreq_policy *policy)
>>           min_fsb = NFORCE2_MIN_FSB;
>>         /* cpuinfo and default policy values */
>> -    policy->min = policy->cpuinfo.min_freq = min_fsb * fid * 100;
>> -    policy->max = policy->cpuinfo.max_freq = max_fsb * fid * 100;
>> +    policy->cpuinfo.min_freq = min_fsb * fid * 100;
>> +    policy->cpuinfo.max_freq = max_fsb * fid * 100;
>>         return 0;
>>   }
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>>       cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>>         if (new_policy) {
>> +        unsigned int min, max;
>> +
>> +        /* Use policy->min/max set by the driver as QoS requests. */
>> +        min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
>> +        if (policy->max)
>> +            max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
>> +        else
>> +            max = FREQ_QOS_MAX_DEFAULT_VALUE;
>
>
> Nit: Using local variables named min/max is confusing here since they
> shadow the common min()/max() macros; renaming them (e.g. min_freq
> / max_freq) would improve readability and maintainability.
>
OK, sure. I will rename them.
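
For reference, the renamed hunk could look roughly like the standalone
model below. This is only a user-space sketch, not the next version of
the patch: the two macros are redefined locally on the assumption that
they match include/linux/pm_qos.h (0 and S32_MAX), and the helper names
qos_min_request() / qos_max_request() are made up for this example.

```c
#include <assert.h>
#include <limits.h>

/* Assumed to mirror include/linux/pm_qos.h; redefined for a user-space build. */
#define FREQ_QOS_MIN_DEFAULT_VALUE 0
#define FREQ_QOS_MAX_DEFAULT_VALUE INT_MAX	/* S32_MAX in the kernel */

/* Hypothetical helper: initial min QoS request from the driver's policy->min. */
static unsigned int qos_min_request(unsigned int policy_min)
{
	unsigned int min_freq = FREQ_QOS_MIN_DEFAULT_VALUE;

	if (policy_min > min_freq)
		min_freq = policy_min;
	return min_freq;
}

/* Hypothetical helper: policy_max == 0 means the driver set no constraint. */
static unsigned int qos_max_request(unsigned int policy_max)
{
	unsigned int max_freq = FREQ_QOS_MAX_DEFAULT_VALUE;

	if (policy_max && policy_max < max_freq)
		max_freq = policy_max;
	return max_freq;
}

/* Small self-check of the two cases: driver set a value, or left it at 0. */
static void qos_request_self_check(void)
{
	assert(qos_min_request(0) == FREQ_QOS_MIN_DEFAULT_VALUE);
	assert(qos_min_request(15000) == 15000);
	assert(qos_max_request(0) == FREQ_QOS_MAX_DEFAULT_VALUE);
	assert(qos_max_request(300000) == 300000);
}
```

With min_freq / max_freq as locals, the min()/max() macros are no longer
shadowed, which was the point of the remark.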
>
>>           for_each_cpu(j, policy->related_cpus) {
>>               per_cpu(cpufreq_cpu_data, j) = policy;
>>               add_cpu_dev_symlink(policy, j, get_cpu_device(j));
>> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>>             ret = freq_qos_add_request(&policy->constraints,
>>                          &policy->min_freq_req, FREQ_QOS_MIN,
>> -                       FREQ_QOS_MIN_DEFAULT_VALUE);
>> +                       min);
>
> It seems that the current patch is not merely a superficial cleanup; it
> also changes the policy->min value in the GX driver, setting it to the
> 5% value expected by the driver. If so, we should document it in the
> commit message.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/drivers/cpufreq/gx-suspmod.c#n137
>
> /* gx-suspmod.c constants:
>  *   POLICY_MIN_DIV = 20
>  *   max_duration   = 255
>  *   maxfreq = e.g. 300000 kHz (300 MHz)
>  */
>
> cpufreq_policy_online()
>   cpufreq_driver->init(policy)      /* cpufreq_gx_cpu_init() */
>     policy->min = maxfreq/20        /* 15000 kHz, 5% */
>     cpuinfo.min_freq = maxfreq/255  /* 1176 kHz, 0.39% */
>
>   /* Before current patch: 0, not policy->min */
>   freq_qos_add_request(..., FREQ_QOS_MIN, 0)
>
>   cpufreq_init_policy()
>     cpufreq_set_policy()
>       /* reads QoS=0, discards init()'s 15000 */
>       new_data.min = freq_qos_read_value(FREQ_QOS_MIN)
>       cpufreq_gx_verify()
>         cpufreq_verify_within_cpu_limits()
>           /* 0 < 1176: clamp to hw floor */
>           new_data.min = cpuinfo.min_freq  /* 1176 kHz */
>       WRITE_ONCE(policy->min, 1176)  /* 0.39%, not 5% */
>
> After current patch:
> freq_qos_add_request(..., FREQ_QOS_MIN, policy->min)
>   => new_data.min stays 15000, no clamping, policy->min = 15000
>
Prior to [1], the policy->min/max values were used as QoS constraints.
This patch effectively sets a min QoS constraint for gx-suspmod again,
but the result should be no different from what the driver was setting
initially.

I will add a reference to that commit and an explanation to the commit
message to avoid any ambiguity.

[1] 521223d8b3ec ("cpufreq: Fix initialization of min and max frequency QoS requests")
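
To make the numbers in your trace concrete, here is a small user-space
model of the two cases. It is only a sketch under the assumptions of the
trace (maxfreq = 300000 kHz, POLICY_MIN_DIV = 20, max_duration = 255);
clamp_to_cpuinfo() and gx_resulting_policy_min() are hypothetical
stand-ins, not kernel functions.

```c
#include <assert.h>

#define POLICY_MIN_DIV	20	/* policy->min = maxfreq / 20, i.e. 5% */
#define MAX_DURATION	255	/* cpuinfo.min_freq = maxfreq / 255 */

/* Hypothetical stand-in for cpufreq_verify_within_cpu_limits(). */
static unsigned int clamp_to_cpuinfo(unsigned int freq, unsigned int hw_min,
				     unsigned int hw_max)
{
	if (freq < hw_min)
		return hw_min;
	if (freq > hw_max)
		return hw_max;
	return freq;
}

/* policy->min that results from a given initial min QoS request. */
static unsigned int gx_resulting_policy_min(unsigned int maxfreq,
					    unsigned int qos_min)
{
	unsigned int hw_min = maxfreq / MAX_DURATION;

	return clamp_to_cpuinfo(qos_min, hw_min, maxfreq);
}

static void gx_self_check(void)
{
	unsigned int maxfreq = 300000;	/* kHz, example value from the trace */
	unsigned int drv_min = maxfreq / POLICY_MIN_DIV;	/* 15000 kHz */

	/* Request starts at 0: clamped up to the hardware floor. */
	assert(gx_resulting_policy_min(maxfreq, 0) == 1176);
	/* Request starts at the driver's policy->min: kept as is. */
	assert(gx_resulting_policy_min(maxfreq, drv_min) == 15000);
}
```

The first assertion corresponds to the behaviour since [1], the second
to the behaviour with this patch applied.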


>
>>           if (ret < 0)
>>               goto out_destroy_policy;
>>             ret = freq_qos_add_request(&policy->constraints,
>>                          &policy->max_freq_req, FREQ_QOS_MAX,
>> -                       FREQ_QOS_MAX_DEFAULT_VALUE);
>> +                       max);
>>           if (ret < 0)
>>               goto out_destroy_policy;
>> blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>>                   CPUFREQ_CREATE_POLICY, policy);
>> +
>> +        /*
>> +         * If the driver didn't set QoS constraints, policy->min/max still
>> +         * need to be set as they are used to clamp frequency requests.
>> +         */
>> +        policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
>> +        policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;
>
>
> Does it make sense to set policy->min / policy->max before the
> CPUFREQ_CREATE_POLICY notifier, since some drivers may use them in their
> callbacks?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/core.rst#n58 
>
>
>
Yes, indeed, this would be better. I will set them before the notifier
is called.
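
The intended ordering can be modelled as below. This is a simplified
user-space sketch rather than the actual change: struct policy_model,
policy_apply_default_limits() and policy_online_model() are hypothetical
names, and the callback only stands in for a CPUFREQ_CREATE_POLICY
notifier.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced model of the cpufreq_policy fields discussed here. */
struct policy_model {
	unsigned int min, max;			/* 0 means "not set by the driver" */
	unsigned int cpuinfo_min, cpuinfo_max;	/* hardware limits */
};

/* Fall back to the hardware limits when the driver left min/max at 0. */
static void policy_apply_default_limits(struct policy_model *p)
{
	if (!p->min)
		p->min = p->cpuinfo_min;
	if (!p->max)
		p->max = p->cpuinfo_max;
}

typedef void (*create_policy_cb)(const struct policy_model *p);

/* Defaults are applied first, so the callback never observes min/max == 0. */
static void policy_online_model(struct policy_model *p, create_policy_cb notify)
{
	policy_apply_default_limits(p);
	if (notify != NULL)
		notify(p);
}

/* Example callback that relies on min/max already being valid. */
static void check_limits_cb(const struct policy_model *p)
{
	assert(p->min != 0 && p->max != 0);
	assert(p->min <= p->max);
}
```

Calling policy_online_model() with min and max left at 0 then hands
cpuinfo_min / cpuinfo_max to the callback.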

Thanks for the review


Thread overview: 8+ messages
2026-04-23  8:47 [PATCH 0/1] cpufreq: Set policy->min and max as real QoS constraints Pierre Gondois
2026-04-23  8:47 ` [PATCH 1/1] " Pierre Gondois
2026-04-27  3:08   ` Jie Zhan
2026-04-30 13:41     ` Pierre Gondois
2026-04-28 16:37   ` Sumit Gupta
2026-04-30 13:41     ` Pierre Gondois
2026-04-29 13:00   ` Zhongqiu Han
2026-04-30 13:41     ` Pierre Gondois [this message]
