public inbox for linux-pm@vger.kernel.org
From: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
To: Pierre Gondois <pierre.gondois@arm.com>, linux-kernel@vger.kernel.org
Cc: Jie Zhan <zhanjie9@hisilicon.com>,
	Lifeng Zheng <zhenglifeng1@huawei.com>,
	Ionela Voinescu <ionela.voinescu@arm.com>,
	Sumit Gupta <sumitg@nvidia.com>, Huang Rui <ray.huang@amd.com>,
	Mario Limonciello <mario.limonciello@amd.com>,
	Perry Yuan <perry.yuan@amd.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Len Brown <lenb@kernel.org>,
	Saravana Kannan <saravanak@kernel.org>,
	linux-pm@vger.kernel.org, zhongqiu.han@oss.qualcomm.com
Subject: Re: [PATCH 1/1] cpufreq: Set policy->min and max as real QoS constraints
Date: Wed, 29 Apr 2026 21:00:55 +0800
Message-ID: <73fac9ca-451d-49f0-b9c7-5ef6bc0119bf@oss.qualcomm.com>
In-Reply-To: <20260423084731.1090384-2-pierre.gondois@arm.com>

On 4/23/2026 4:47 PM, Pierre Gondois wrote:
> cpufreq_set_policy() will ultimately override the policy min/max
> values written in the .init() callback through:
> cpufreq_policy_online()
> \-cpufreq_init_policy()
>    \-cpufreq_set_policy()
>      \-/* Set policy->min/max */
> Thus the policy min/max values provided are only temporary.
> 
> There is an exception if CPUFREQ_NEED_INITIAL_FREQ_CHECK is set and:
> cpufreq_policy_online()
> \-cpufreq_init_policy()
>    \-__cpufreq_driver_target()
>      \-cpufreq_driver->target()
> is called. To avoid any regression, set policy->min/max in cpufreq.c
> if the values were not initialized.
> 
> In this patch:
> - Setting policy->min or max value in driver .init() cb is
>    interpreted as setting a QoS constraint.
> - Remove policy->min/max initialization in drivers if the values
>    are similar to policy->cpuinfo.min_freq/max_freq.
>    The only drivers where these values are different are:
>    - gx-suspmod.c
>    - cppc-cpufreq.c
>    - longrun.c
> - For the cppc-cpufreq driver, the lowest non-linear freq. is
>    used as a min QoS constraint as suggested at:
>    https://lore.kernel.org/lkml/20260213100633.15413-1-zhangpengjie2@huawei.com/
> 
> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>


Hi Pierre,
Thanks for the patch. I have a few additional inline comments/questions
below.


> ---
>   drivers/cpufreq/amd-pstate.c      | 16 ++++++++--------
>   drivers/cpufreq/cppc_cpufreq.c    | 11 +++++++----
>   drivers/cpufreq/cpufreq-nforce2.c |  4 ++--
>   drivers/cpufreq/cpufreq.c         | 19 +++++++++++++++++--
>   drivers/cpufreq/freq_table.c      |  7 +++----
>   drivers/cpufreq/gx-suspmod.c      |  9 +++++----
>   drivers/cpufreq/intel_pstate.c    |  3 ---
>   drivers/cpufreq/pcc-cpufreq.c     |  8 ++++----
>   drivers/cpufreq/pxa3xx-cpufreq.c  |  4 ++--
>   drivers/cpufreq/sh-cpufreq.c      |  4 ++--
>   drivers/cpufreq/virtual-cpufreq.c |  5 +----
>   11 files changed, 51 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 453084c67327f..1ed4bcdcc957f 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -1090,10 +1090,10 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>   
>   	perf = READ_ONCE(cpudata->perf);
>   
> -	policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
> -							      cpudata->nominal_freq,
> -							      perf.lowest_perf);
> -	policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
> +	policy->cpuinfo.min_freq = perf_to_freq(perf,
> +						cpudata->nominal_freq,
> +						perf.lowest_perf);
> +	policy->cpuinfo.max_freq = cpudata->max_freq;


Would it be better to update the documentation as well, so that new
driver developers do not set policy->min / policy->max again?

https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/cpu-drivers.rst#n102

>   
>   	policy->driver_data = cpudata;
>   	ret = amd_pstate_cppc_enable(policy);
> @@ -1907,10 +1907,10 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>   
>   	perf = READ_ONCE(cpudata->perf);
>   
> -	policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
> -							      cpudata->nominal_freq,
> -							      perf.lowest_perf);
> -	policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
> +	policy->cpuinfo.min_freq = perf_to_freq(perf,
> +						cpudata->nominal_freq,
> +						perf.lowest_perf);
> +	policy->cpuinfo.max_freq = cpudata->max_freq;
>   	policy->driver_data = cpudata;
>   
>   	ret = amd_pstate_cppc_enable(policy);
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 7e7f9dfb7a24c..c6fcecdbbab0c 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -645,6 +645,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>   	unsigned int cpu = policy->cpu;
>   	struct cppc_cpudata *cpu_data;
>   	struct cppc_perf_caps *caps;
> +	unsigned int min, max;
>   	int ret;
>   
>   	cpu_data = cppc_cpufreq_get_cpu_data(cpu);
> @@ -655,13 +656,15 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>   	caps = &cpu_data->perf_caps;
>   	policy->driver_data = cpu_data;
>   
> +	min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
> +	max = cppc_perf_to_khz(caps, policy->boost_enabled ?
> +			       caps->highest_perf : caps->nominal_perf);
> +
>   	/*
>   	 * Set min to lowest nonlinear perf to avoid any efficiency penalty (see
>   	 * Section 8.4.7.1.1.5 of ACPI 6.1 spec)
>   	 */
> -	policy->min = cppc_perf_to_khz(caps, caps->lowest_nonlinear_perf);
> -	policy->max = cppc_perf_to_khz(caps, policy->boost_enabled ?
> -						caps->highest_perf : caps->nominal_perf);
> +	policy->min = min;
>   
>   	/*
>   	 * Set cpuinfo.min_freq to Lowest to make the full range of performance
> @@ -669,7 +672,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>   	 * nonlinear perf
>   	 */
>   	policy->cpuinfo.min_freq = cppc_perf_to_khz(caps, caps->lowest_perf);
> -	policy->cpuinfo.max_freq = policy->max;
> +	policy->cpuinfo.max_freq = max;
>   
>   	policy->transition_delay_us = cppc_cpufreq_get_transition_delay_us(cpu);
>   	policy->shared_type = cpu_data->shared_type;
> diff --git a/drivers/cpufreq/cpufreq-nforce2.c b/drivers/cpufreq/cpufreq-nforce2.c
> index fbbbe501cf2dc..831102522ad64 100644
> --- a/drivers/cpufreq/cpufreq-nforce2.c
> +++ b/drivers/cpufreq/cpufreq-nforce2.c
> @@ -355,8 +355,8 @@ static int nforce2_cpu_init(struct cpufreq_policy *policy)
>   		min_fsb = NFORCE2_MIN_FSB;
>   
>   	/* cpuinfo and default policy values */
> -	policy->min = policy->cpuinfo.min_freq = min_fsb * fid * 100;
> -	policy->max = policy->cpuinfo.max_freq = max_fsb * fid * 100;
> +	policy->cpuinfo.min_freq = min_fsb * fid * 100;
> +	policy->cpuinfo.max_freq = max_fsb * fid * 100;
>   
>   	return 0;
>   }
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 44eb1b7e7fc1b..b30bfa3e27daa 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1453,6 +1453,14 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>   	cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>   
>   	if (new_policy) {
> +		unsigned int min, max;
> +
> +		/* Use policy->min/max set by the driver as QoS requests. */
> +		min = max(FREQ_QOS_MIN_DEFAULT_VALUE, policy->min);
> +		if (policy->max)
> +			max = min(FREQ_QOS_MAX_DEFAULT_VALUE, policy->max);
> +		else
> +			max = FREQ_QOS_MAX_DEFAULT_VALUE;


Nit: Using local variables named min/max is confusing here since they
share their names with the common min()/max() macros used on the same
lines; renaming them (e.g. min_freq / max_freq) would improve
readability and maintainability.
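
To illustrate the renaming, here is a minimal userspace sketch of the
same computation. It is not the kernel code: MIN_OF()/MAX_OF() stand in
for the kernel's min()/max(), the two default values are assumed to be
0 and INT_MAX, and struct qos_init and initial_qos() are hypothetical
names used only for this example.

```c
#include <assert.h>
#include <limits.h>

/* Assumed stand-ins for the kernel's QoS defaults (not taken from the patch). */
#define FREQ_QOS_MIN_DEFAULT_VALUE 0u
#define FREQ_QOS_MAX_DEFAULT_VALUE ((unsigned int)INT_MAX)

/* Simplified stand-ins for the kernel's min()/max() macros. */
#define MIN_OF(a, b) ((a) < (b) ? (a) : (b))
#define MAX_OF(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical container for the two initial QoS request values. */
struct qos_init {
	unsigned int min_freq;
	unsigned int max_freq;
};

/*
 * Derive the initial QoS requests from what the driver left in
 * policy->min/max. A zero policy_max means "not set by the driver".
 */
static struct qos_init initial_qos(unsigned int policy_min,
				   unsigned int policy_max)
{
	struct qos_init q;

	/* Sketch-only assumption: a driver never sets min above max. */
	assert(!policy_max || policy_min <= policy_max);

	q.min_freq = MAX_OF(FREQ_QOS_MIN_DEFAULT_VALUE, policy_min);
	if (policy_max)
		q.max_freq = MIN_OF(FREQ_QOS_MAX_DEFAULT_VALUE, policy_max);
	else
		q.max_freq = FREQ_QOS_MAX_DEFAULT_VALUE;

	return q;
}
```

With min_freq/max_freq as names, the macro calls and the variables can
no longer be mistaken for one another when reading the assignments.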


>   		for_each_cpu(j, policy->related_cpus) {
>   			per_cpu(cpufreq_cpu_data, j) = policy;
>   			add_cpu_dev_symlink(policy, j, get_cpu_device(j));
> @@ -1469,18 +1477,25 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>   
>   		ret = freq_qos_add_request(&policy->constraints,
>   					   &policy->min_freq_req, FREQ_QOS_MIN,
> -					   FREQ_QOS_MIN_DEFAULT_VALUE);
> +					   min);

It seems that the current patch is not merely a superficial cleanup; it
also changes the policy->min value in the GX driver, setting it to the
5% value expected by the driver. If so, we should document it in the
commit message.

https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/drivers/cpufreq/gx-suspmod.c#n137

/* gx-suspmod.c constants:
  *   POLICY_MIN_DIV = 20
  *   max_duration   = 255
  *   maxfreq = e.g. 300000 kHz (300 MHz)
  */

cpufreq_policy_online()
   cpufreq_driver->init(policy)      /* cpufreq_gx_cpu_init() */
     policy->min = maxfreq/20        /* 15000 kHz, 5% */
     cpuinfo.min_freq = maxfreq/255  /* 1176 kHz, 0.39% */

   /* Before current patch: 0, not policy->min */
   freq_qos_add_request(..., FREQ_QOS_MIN, 0)

   cpufreq_init_policy()
     cpufreq_set_policy()
       /* reads QoS=0, discards init()'s 15000 */
       new_data.min = freq_qos_read_value(FREQ_QOS_MIN)
       cpufreq_gx_verify()
         cpufreq_verify_within_cpu_limits()
           /* 0 < 1176: clamp to hw floor */
           new_data.min = cpuinfo.min_freq  /* 1176 kHz */
       WRITE_ONCE(policy->min, 1176)  /* 0.39%, not 5% */

After current patch:
freq_qos_add_request(..., FREQ_QOS_MIN, policy->min)
   => new_data.min stays 15000, no clamping, policy->min = 15000
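
The arithmetic in the trace above can be checked with a small userspace
sketch. It only models the two steps that matter here (the values the
GX init callback computes, and the clamping of the QoS min to the
cpuinfo range); struct freq_limits, gx_init() and resolve_policy_min()
are hypothetical names, and the clamp is a simplification of what
cpufreq_verify_within_cpu_limits() does.

```c
#include <assert.h>

#define POLICY_MIN_DIV 20 /* value quoted in the trace above */

/* Hypothetical subset of the policy fields involved. */
struct freq_limits {
	unsigned int cpuinfo_min; /* hardware floor, in kHz */
	unsigned int cpuinfo_max; /* hardware ceiling, in kHz */
	unsigned int policy_min;  /* min requested by the driver, in kHz */
};

/* Model of the values computed in cpufreq_gx_cpu_init(), per the trace. */
static struct freq_limits gx_init(unsigned int maxfreq,
				  unsigned int max_duration)
{
	struct freq_limits l;
	unsigned int div;

	assert(max_duration > 0); /* avoid dividing by zero in the sketch */

	div = max_duration < POLICY_MIN_DIV ? max_duration : POLICY_MIN_DIV;
	l.policy_min = maxfreq / div;
	l.cpuinfo_min = maxfreq / max_duration;
	l.cpuinfo_max = maxfreq;
	return l;
}

/* Simplified clamp of a QoS min request to the cpuinfo range. */
static unsigned int resolve_policy_min(const struct freq_limits *l,
				       unsigned int qos_min)
{
	if (qos_min < l->cpuinfo_min)
		return l->cpuinfo_min;
	if (qos_min > l->cpuinfo_max)
		return l->cpuinfo_max;
	return qos_min;
}
```

For maxfreq = 300000 and max_duration = 255, a QoS min of 0 (before the
patch) resolves to 1176 kHz, while a QoS min equal to the driver's
policy_min (after the patch) resolves to 15000 kHz.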


>   		if (ret < 0)
>   			goto out_destroy_policy;
>   
>   		ret = freq_qos_add_request(&policy->constraints,
>   					   &policy->max_freq_req, FREQ_QOS_MAX,
> -					   FREQ_QOS_MAX_DEFAULT_VALUE);
> +					   max);
>   		if (ret < 0)
>   			goto out_destroy_policy;
>   
>   		blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
>   				CPUFREQ_CREATE_POLICY, policy);
> +
> +		/*
> +		 * If the driver didn't set QoS constraints, policy->min/max still
> +		 * need to be set as they are used to clamp frequency requests.
> +		 */
> +		policy->min = policy->min ? policy->min : policy->cpuinfo.min_freq;
> +		policy->max = policy->max ? policy->max : policy->cpuinfo.max_freq;


Does it make sense to set policy->min / policy->max before the
CPUFREQ_CREATE_POLICY notifier, since some drivers may use them in their
callbacks?

https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git/tree/Documentation/cpu-freq/core.rst#n58
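
To make the ordering concern concrete, here is a minimal userspace
sketch, assuming a callback that reads policy->min/max when the policy
is created. Nothing here is kernel API: struct policy, struct observed,
record_limits() and policy_online() are hypothetical names, and the
callback only stands in for a CPUFREQ_CREATE_POLICY listener.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of struct cpufreq_policy. */
struct policy {
	unsigned int min;
	unsigned int max;
	unsigned int cpuinfo_min_freq;
	unsigned int cpuinfo_max_freq;
};

/* What a listener saw at notification time. */
struct observed {
	unsigned int min;
	unsigned int max;
};

typedef void (*create_policy_cb)(const struct policy *p,
				 struct observed *seen);

/* Stand-in for a CPUFREQ_CREATE_POLICY listener reading the limits. */
static void record_limits(const struct policy *p, struct observed *seen)
{
	seen->min = p->min;
	seen->max = p->max;
}

/* Fallback from the patch: use cpuinfo limits if the driver set none. */
static void apply_fallback(struct policy *p)
{
	if (!p->min)
		p->min = p->cpuinfo_min_freq;
	if (!p->max)
		p->max = p->cpuinfo_max_freq;
}

/* Run the fallback either before or after notifying the listener. */
static struct observed policy_online(struct policy *p, int fallback_first,
				     create_policy_cb cb)
{
	struct observed seen = { 0, 0 };

	assert(p != NULL && cb != NULL);

	if (fallback_first)
		apply_fallback(p);
	cb(p, &seen);
	if (!fallback_first)
		apply_fallback(p);

	return seen;
}
```

With the fallback applied after the callback, as in the hunk above, a
listener sees 0 for both limits when the driver did not set them;
applying it first lets the listener see the cpuinfo limits instead.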


>   	}
>   
>   	if (cpufreq_driver->get && has_target()) {
> diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c
> index 5b364d8da4f92..ea994647abc88 100644
> --- a/drivers/cpufreq/freq_table.c
> +++ b/drivers/cpufreq/freq_table.c
> @@ -49,16 +49,15 @@ int cpufreq_frequency_table_cpuinfo(struct cpufreq_policy *policy)
>   			max_freq = freq;
>   	}
>   
> -	policy->min = policy->cpuinfo.min_freq = min_freq;
> -	policy->max = max_freq;
> +	policy->cpuinfo.min_freq = min_freq;
>   	/*
>   	 * If the driver has set its own cpuinfo.max_freq above max_freq, leave
>   	 * it as is.
>   	 */
>   	if (policy->cpuinfo.max_freq < max_freq)
> -		policy->max = policy->cpuinfo.max_freq = max_freq;
> +		policy->cpuinfo.max_freq = max_freq;
>   
> -	if (policy->min == ~0)
> +	if (min_freq == ~0)
>   		return -EINVAL;
>   	else
>   		return 0;
> diff --git a/drivers/cpufreq/gx-suspmod.c b/drivers/cpufreq/gx-suspmod.c
> index d269a4f26f98e..ebda2bbebf44b 100644
> --- a/drivers/cpufreq/gx-suspmod.c
> +++ b/drivers/cpufreq/gx-suspmod.c
> @@ -397,7 +397,7 @@ static int cpufreq_gx_target(struct cpufreq_policy *policy,
>   
>   static int cpufreq_gx_cpu_init(struct cpufreq_policy *policy)
>   {
> -	unsigned int maxfreq;
> +	unsigned int minfreq, maxfreq;
>   
>   	if (!policy || policy->cpu != 0)
>   		return -ENODEV;
> @@ -418,10 +418,11 @@ static int cpufreq_gx_cpu_init(struct cpufreq_policy *policy)
>   	policy->cpu = 0;
>   
>   	if (max_duration < POLICY_MIN_DIV)
> -		policy->min = maxfreq / max_duration;
> +		minfreq = maxfreq / max_duration;
>   	else
> -		policy->min = maxfreq / POLICY_MIN_DIV;
> -	policy->max = maxfreq;
> +		minfreq = maxfreq / POLICY_MIN_DIV;
> +
> +	policy->min = minfreq;
>   	policy->cpuinfo.min_freq = maxfreq / max_duration;
>   	policy->cpuinfo.max_freq = maxfreq;
>   
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 1292da53e5fcb..68ccc6eb1ef30 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -3049,9 +3049,6 @@ static int __intel_pstate_cpu_init(struct cpufreq_policy *policy)
>   	policy->cpuinfo.max_freq = READ_ONCE(global.no_turbo) ?
>   			cpu->pstate.max_freq : cpu->pstate.turbo_freq;
>   
> -	policy->min = policy->cpuinfo.min_freq;
> -	policy->max = policy->cpuinfo.max_freq;
> -
>   	intel_pstate_init_acpi_perf_limits(policy);
>   
>   	policy->fast_switch_possible = true;
> diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c
> index ac2e90a65f0c4..231edfe8cabaa 100644
> --- a/drivers/cpufreq/pcc-cpufreq.c
> +++ b/drivers/cpufreq/pcc-cpufreq.c
> @@ -551,13 +551,13 @@ static int pcc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>   		goto out;
>   	}
>   
> -	policy->max = policy->cpuinfo.max_freq =
> +	policy->cpuinfo.max_freq =
>   		ioread32(&pcch_hdr->nominal) * 1000;
> -	policy->min = policy->cpuinfo.min_freq =
> +	policy->cpuinfo.min_freq =
>   		ioread32(&pcch_hdr->minimum_frequency) * 1000;
>   
> -	pr_debug("init: policy->max is %d, policy->min is %d\n",
> -		policy->max, policy->min);
> +	pr_debug("init: max_freq is %d, min_freq is %d\n",
> +		 policy->cpuinfo.max_freq, policy->cpuinfo.min_freq);
>   out:
>   	return result;
>   }
> diff --git a/drivers/cpufreq/pxa3xx-cpufreq.c b/drivers/cpufreq/pxa3xx-cpufreq.c
> index 50ff3b6a69000..181962b0924e6 100644
> --- a/drivers/cpufreq/pxa3xx-cpufreq.c
> +++ b/drivers/cpufreq/pxa3xx-cpufreq.c
> @@ -185,8 +185,8 @@ static int pxa3xx_cpufreq_init(struct cpufreq_policy *policy)
>   	int ret = -EINVAL;
>   
>   	/* set default policy and cpuinfo */
> -	policy->min = policy->cpuinfo.min_freq = 104000;
> -	policy->max = policy->cpuinfo.max_freq =
> +	policy->cpuinfo.min_freq = 104000;
> +	policy->cpuinfo.max_freq =
>   		(cpu_is_pxa320()) ? 806000 : 624000;
>   	policy->cpuinfo.transition_latency = 1000; /* FIXME: 1 ms, assumed */
>   
> diff --git a/drivers/cpufreq/sh-cpufreq.c b/drivers/cpufreq/sh-cpufreq.c
> index 642ddb9ea217e..244153a1cead2 100644
> --- a/drivers/cpufreq/sh-cpufreq.c
> +++ b/drivers/cpufreq/sh-cpufreq.c
> @@ -124,9 +124,9 @@ static int sh_cpufreq_cpu_init(struct cpufreq_policy *policy)
>   		dev_notice(dev, "no frequency table found, falling back "
>   			   "to rate rounding.\n");
>   
> -		policy->min = policy->cpuinfo.min_freq =
> +		policy->cpuinfo.min_freq =
>   			(clk_round_rate(cpuclk, 1) + 500) / 1000;
> -		policy->max = policy->cpuinfo.max_freq =
> +		policy->cpuinfo.max_freq =
>   			(clk_round_rate(cpuclk, ~0UL) + 500) / 1000;
>   	}
>   
> diff --git a/drivers/cpufreq/virtual-cpufreq.c b/drivers/cpufreq/virtual-cpufreq.c
> index 4159f31349b16..dc78b74409af4 100644
> --- a/drivers/cpufreq/virtual-cpufreq.c
> +++ b/drivers/cpufreq/virtual-cpufreq.c
> @@ -164,10 +164,7 @@ static int virt_cpufreq_get_freq_info(struct cpufreq_policy *policy)
>   		policy->cpuinfo.min_freq = 1;
>   		policy->cpuinfo.max_freq = virt_cpufreq_get_perftbl_entry(policy->cpu, 0);
>   
> -		policy->min = policy->cpuinfo.min_freq;
> -		policy->max = policy->cpuinfo.max_freq;
> -
> -		policy->cur = policy->max;
> +		policy->cur = policy->cpuinfo.max_freq;
>   		return 0;
>   	}
>   


-- 
Thx and BRs,
Zhongqiu Han

