From: Sumit Gupta <sumitg@nvidia.com>
To: "zhenglifeng (A)" <zhenglifeng1@huawei.com>,
Pierre Gondois <pierre.gondois@arm.com>,
rafael@kernel.org, viresh.kumar@linaro.org
Cc: linux-tegra@vger.kernel.org, linux-pm@vger.kernel.org,
ray.huang@amd.com, corbet@lwn.net, robert.moore@intel.com,
lenb@kernel.org, acpica-devel@lists.linux.dev,
mario.limonciello@amd.com, rdunlap@infradead.org,
linux-kernel@vger.kernel.org, gautham.shenoy@amd.com,
zhanjie9@hisilicon.com, ionela.voinescu@arm.com,
perry.yuan@amd.com, linux-doc@vger.kernel.org,
linux-acpi@vger.kernel.org, treding@nvidia.com,
jonathanh@nvidia.com, vsethi@nvidia.com, ksitaraman@nvidia.com,
sanjayc@nvidia.com, nhartman@nvidia.com, bbasu@nvidia.com,
sumitg@nvidia.com
Subject: Re: [PATCH v5 10/11] cpufreq: CPPC: make scaling_min/max_freq read-only when auto_sel enabled
Date: Thu, 15 Jan 2026 20:52:25 +0530
Message-ID: <0d1a10e8-a8d5-4d27-bd16-0443d5408ca6@nvidia.com>
In-Reply-To: <27750fe9-8b0e-4687-bc5f-21e4ec38bf66@huawei.com>
>>> On 08/01/26 22:16, Pierre Gondois wrote:
>>>>
>>>> Hello Sumit, Lifeng,
>>>>
>>>> On 12/23/25 13:13, Sumit Gupta wrote:
>>>>> When autonomous selection (auto_sel) is enabled, the hardware controls
>>>>> performance within min_perf/max_perf register bounds making the
>>>>> scaling_min/max_freq effectively read-only.
>>>> If auto_sel is set, the governor associated to the policy will have no
>>>> actual control.
>>>>
>>>> E.g.:
>>>> If the schedutil governor is used, attempts to set the
>>>> frequency based on CPU utilization will be periodically
>>>> sent, but they will have no effect.
>>>>
>>>> The same thing will happen for the ondemand, performance,
>>>> powersave, userspace, etc. governors. They can only work if
>>>> frequency requests are taken into account.
>>>>
>>>> ------------
>>>>
>>>> This looks like the intel_pstate governor handling where it is possible
>>>> not to have .target() or .target_index() callback and the hardware is in
>>>> charge (IIUC).
>>>> For this case, only 2 governors seem available: performance and powersave.
>>>>
> As you mentioned in [2], 'it still makes sense to have cpufreq requesting a
> certain performance level even though autonomous selection is enabled'. So I
> think it's OK to have a governor when auto_selection is enabled.
>
> [2] https://lore.kernel.org/all/9f46991d-98c3-41f5-8133-6612b397e33a@arm.com/
>
>> Thanks for pointing me to the first version, I forgot how your
>> first implementation was.
>>
>>
>>> In v1 [1], I added a separate cppc_cpufreq_epp_driver instance without
>>> target*() hooks, using setpolicy() instead (similar to AMD pstate).
>>> However, this approach doesn't allow per-CPU control: if we boot with the
>>> EPP driver, we can't dynamically disable auto_sel for individual CPUs and
>>> return to OS governor control (no target hook available). AMD and Intel
>>> pstate drivers seem to set HW autonomous mode for all CPUs globally,
>>> not per-CPU. So, changed it in v2.
>>> [1] https://lore.kernel.org/lkml/20250211103737.447704-6-sumitg@nvidia.com/
>>>
>> Ok right.
>> This is something I don't really understand in the current intel/amd cpufreq
>> drivers. FWIU:
>> - the cpufreq drivers abstractions allow to access different hardware
>> - the governor abstraction allows to switch between different algorithms
>> to select the 'correct' frequency.
>>
>> So IMO switching to autonomous selection should be done by switching
>> to another governor and the 'auto_sel' file should not be accessible to users.
>>
>> ------------
>>
>> Being able to enable/disable the autonomous selection on a per-policy
>> basis seems a valid use-case. It also seems to fit the per-policy governor
>> capabilities.
> I'm OK with adding an auto-selection governor. It's better to keep this
> governor only in cppc_cpufreq for now I think.
>
>> However toggling the auto_sel on different CPUs inside the same policy
>> seems inappropriate (this is not what is done in this patchset IIUC).
>>
> I think Sumit means per-policy when he said per-CPU.
Yes, it's per-policy.
Thank you,
Sumit Gupta
>>>> ------------
>>>>
>>>> In our case, I think it is desired to unload the scaling governor currently
>>>> in use if auto_sel is selected. Letting the rest of the system think it has
>>>> control over the freq. selection seems incorrect.
>>>> I am not sure what to replace it with:
>>>> - There are no specific performance/powersave modes for CPPC.
>>>>   There is a range of values between 0-255
>>>> - A firmware auto-selection governor could be created just for this case.
>>>>   Being able to switch between OS-driven and firmware driven freq. selection
>>>>   is not specific to CPPC (for the future).
>>>>   However I am not really able to say the implications of doing that.
>>>>
>>>> ------------
>>>>
>>>> I think it would be better to split your patchset in 2:
>>>> 1. adding APIs for the CPPC spec.
>>>> 2. using the APIs, especially for auto_sel
>>>>
>>>> 1. is likely to be straightforward as the APIs will still be used
>>>> by the driver at some point.
>>>> 2. is likely to bring more discussion.
>>>>
>>> We discussed adding a hw_auto_sel governor as a second step, though the
>>> approach may need refinement during implementation.
>> I didn't find where adding a new governor was discussed in the thread;
>> please share a direct link if you have one.
>>
>>> Deferred it (to second step) because adding a new governor requires
>>> broader discussion.
>>>
>>> This issue already exists in current code - store_auto_select() enables
>>> auto_sel without any governor awareness. These patches improve the
>>> situation by:
>>> - Updating scaling_min/max_freq when toggling auto_sel mode
>>> - Syncing policy limits with actual HW min/max_perf bounds
>>> - Making scaling_min/max_freq read-only in auto_sel mode
>>>
>>> Would it be acceptable to merge this as a first step, with the governor
>>> handling as a follow-up?
>>> If not and you prefer splitting, which grouping works better:
>>> A) Patches 1-8 then 9-11.
>>> B) "ACPI: CPPC *" patches then "cpufreq: CPPC *" patches.
>>>
>> If it's possible I would like to understand what the end result should
>> look like. If ultimately enabling auto_sel implies switching governor
>> I understand, but I didn't find the thread that discussed that
>> unfortunately.
>>
>>
>>>>> Enforce this by setting policy limits to min/max_perf bounds in
>>>>> cppc_verify_policy(). Users must use min_perf/max_perf sysfs interfaces
>>>>> to change performance limits in autonomous mode.
>>>>>
>>>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>>>>> ---
>>>>> drivers/cpufreq/cppc_cpufreq.c | 32 +++++++++++++++++++++++++++++++-
>>>>> 1 file changed, 31 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>>>> index b1f570d6de34..b3da263c18b0 100644
>>>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>>>> @@ -305,7 +305,37 @@ static unsigned int cppc_cpufreq_fast_switch(struct cpufreq_policy *policy,
>>>>>
>>>>>  static int cppc_verify_policy(struct cpufreq_policy_data *policy)
>>>>>  {
>>>>> -	cpufreq_verify_within_cpu_limits(policy);
>>>>> +	unsigned int min_freq = policy->cpuinfo.min_freq;
>>>>> +	unsigned int max_freq = policy->cpuinfo.max_freq;
>>>>> +	struct cpufreq_policy *cpu_policy;
>>>>> +	struct cppc_cpudata *cpu_data;
>>>>> +	struct cppc_perf_caps *caps;
>>>>> +
>>>>> +	cpu_policy = cpufreq_cpu_get(policy->cpu);
>>>>> +	if (!cpu_policy)
>>>>> +		return -ENODEV;
>>>>> +
>>>>> +	cpu_data = cpu_policy->driver_data;
>>>>> +	caps = &cpu_data->perf_caps;
>>>>> +
>>>>> +	if (cpu_data->perf_ctrls.auto_sel) {
>>>>> +		u32 min_perf, max_perf;
>>>>> +
>>>>> +		/*
>>>>> +		 * Set policy limits to HW min/max_perf bounds. In autonomous
>>>>> +		 * mode, scaling_min/max_freq is effectively read-only.
>>>>> +		 */
>>>>> +		min_perf = cpu_data->perf_ctrls.min_perf ?:
>>>>> +			   caps->lowest_nonlinear_perf;
>>>>> +		max_perf = cpu_data->perf_ctrls.max_perf ?: caps->nominal_perf;
>>>>> +
>>>>> +		policy->min = cppc_perf_to_khz(caps, min_perf);
>>>>> +		policy->max = cppc_perf_to_khz(caps, max_perf);
>>>> policy->min/max values are overwritten, but the governor which is
>>>> supposed to use them to select the most fitting frequency will be
>>>> ignored by the firmware I think.
>>>>
>>> Yes.
>>>
>>>>> +	} else {
>>>>> +		cpufreq_verify_within_limits(policy, min_freq, max_freq);
>>>>> +	}
>>>>> +
>>>>> +	cpufreq_cpu_put(cpu_policy);
>>>>>  	return 0;
>>>>>  }
>>>>>
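
For anyone following the hunk above outside a kernel tree, the two branches
of the proposed cppc_verify_policy() can be approximated in standalone C as
below. This is only a rough sketch under stated assumptions: struct
perf_state, struct limits, perf_to_khz(), clamp_khz() and verify_limits()
are hypothetical names made up for illustration, the perf-to-kHz mapping is
assumed to be linear through the nominal point (the real cppc_perf_to_khz()
is more involved), and the GNU "?:" shorthand from the patch is spelled out
as a plain ternary so it builds with any C compiler.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fields the hunk reads; not the kernel structs. */
struct perf_state {
	uint32_t min_perf;              /* 0 is treated as "not programmed" */
	uint32_t max_perf;              /* 0 is treated as "not programmed" */
	uint32_t lowest_nonlinear_perf;
	uint32_t nominal_perf;
	uint32_t nominal_khz;           /* assumed frequency at nominal_perf */
	int auto_sel;
};

/* Hypothetical stand-in for the policy->min / policy->max pair. */
struct limits {
	unsigned int min;
	unsigned int max;
};

/* Assumption: linear perf -> kHz mapping through the nominal point. */
static unsigned int perf_to_khz(const struct perf_state *s, uint32_t perf)
{
	return (unsigned int)((uint64_t)perf * s->nominal_khz / s->nominal_perf);
}

static unsigned int clamp_khz(unsigned int v, unsigned int lo, unsigned int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * auto_sel set:   the requested limits are overwritten with the HW
 *                 min/max_perf bounds, falling back to
 *                 lowest_nonlinear_perf / nominal_perf when a bound is 0.
 * auto_sel clear: the requested limits are only clamped into the
 *                 cpuinfo range (simplified version of the usual check).
 */
static void verify_limits(const struct perf_state *s, struct limits *req,
			  unsigned int cpuinfo_min, unsigned int cpuinfo_max)
{
	if (s->auto_sel) {
		uint32_t min_perf = s->min_perf ? s->min_perf
						: s->lowest_nonlinear_perf;
		uint32_t max_perf = s->max_perf ? s->max_perf
						: s->nominal_perf;

		req->min = perf_to_khz(s, min_perf);
		req->max = perf_to_khz(s, max_perf);
	} else {
		req->max = clamp_khz(req->max, cpuinfo_min, cpuinfo_max);
		req->min = clamp_khz(req->min, cpuinfo_min, req->max);
	}
}
```

In the auto_sel branch whatever userspace asked for is discarded, which is
what makes scaling_min/max_freq behave as read-only; it does not by itself
address the point raised above that a governor is still attached and its
requests are ignored by the firmware.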