From: Sumit Gupta <sumitg@nvidia.com>
To: Pierre Gondois <pierre.gondois@arm.com>
Cc: "linux-tegra@vger.kernel.org" <linux-tegra@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"zhenglifeng1@huawei.com" <zhenglifeng1@huawei.com>,
	Thierry Reding <treding@nvidia.com>,
	"viresh.kumar@linaro.org" <viresh.kumar@linaro.org>,
	Jon Hunter <jonathanh@nvidia.com>,
	Vikram Sethi <vsethi@nvidia.com>,
	"ionela.voinescu@arm.com" <ionela.voinescu@arm.com>,
	Krishna Sitaraman <ksitaraman@nvidia.com>,
	Sanjay Chandrashekara <sanjayc@nvidia.com>,
	"zhanjie9@hisilicon.com" <zhanjie9@hisilicon.com>,
	"corbet@lwn.net" <corbet@lwn.net>, Matt Ochs <mochs@nvidia.com>,
	"skhan@linuxfoundation.org" <skhan@linuxfoundation.org>,
	Bibek Basu <bbasu@nvidia.com>,
	"rdunlap@infradead.org" <rdunlap@infradead.org>,
	"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
	"mario.limonciello@amd.com" <mario.limonciello@amd.com>,
	"rafael@kernel.org" <rafael@kernel.org>,
	sumitg@nvidia.com
Subject: Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
Date: Fri, 24 Apr 2026 19:22:20 +0530
Message-ID: <4e3cba64-2bd5-4789-b118-95a7c980731d@nvidia.com>
In-Reply-To: <aeb16dd2-0eb5-4fba-9b45-b5ef483ab7b4@arm.com>


On 24/04/26 18:25, Pierre Gondois wrote:
>
> On 4/24/26 14:10, Sumit Gupta wrote:
>>
>> On 20/04/26 18:37, Sumit Gupta wrote:
>>>
>>>>>> On 3/17/26 16:10, Sumit Gupta wrote:
>>>>>>> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC
>>>>>>> autonomous performance selection on all CPUs at system startup without
>>>>>>> requiring runtime sysfs manipulation. When autonomous mode is enabled,
>>>>>>> the hardware automatically adjusts CPU performance based on workload
>>>>>>> demands using Energy Performance Preference (EPP) hints.
>>>>>>>
>>>>>>> When auto_sel_mode=1:
>>>>>>> - Configure all CPUs for autonomous operation on first init
>>>>>>> - Set EPP to performance preference (0x0)
>>>>>>> - Use HW min/max when set; otherwise program from policy limits (caps)
>>>>>>> - Clamp desired_perf to bounds before enabling autonomous mode
>>>>>>> - Hardware controls frequency instead of the OS governor
>>>>>>>
>>>>>>> The boot parameter is applied only during first policy initialization.
>>>>>>> On hotplug, skip applying it so that the user's runtime sysfs
>>>>>>> configuration is preserved.
>>>>>>>
>>>>>>> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation)
>>>>>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>>>>>>> ---
>>>>>>> Part 1 [1] of this series was applied for 7.1 and present in next.
>>>>>>> Sending this patch as reworked version of 'patch 11' from [2] based
>>>>>>> on next.
>>>>>>>
>>>>>>> [1] https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/
>>>>>>> [2] https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/
>>>>>>>
>>>>>>> ---
>>>>>>>    .../admin-guide/kernel-parameters.txt         | 13 +++
>>>>>>>    drivers/cpufreq/cppc_cpufreq.c                | 84 +++++++++++++++++--
>>>>>>>    2 files changed, 92 insertions(+), 5 deletions(-)
>>>>>>>
>>>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>> index fa6171b5fdd5..de4b4c89edfe 100644
>>>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>> @@ -1060,6 +1060,19 @@ Kernel parameters
>>>>>>>                        policy to use. This governor must be registered in the
>>>>>>>                        kernel before the cpufreq driver probes.
>>>>>>>
>>>>>>> +     cppc_cpufreq.auto_sel_mode=
>>>>>>> +                     [CPU_FREQ] Enable ACPI CPPC autonomous performance
>>>>>>> +                     selection. When enabled, hardware automatically adjusts
>>>>>>> +                     CPU frequency on all CPUs based on workload demands.
>>>>>>> +                     In Autonomous mode, Energy Performance Preference (EPP)
>>>>>>> +                     hints guide hardware toward performance (0x0) or energy
>>>>>>> +                     efficiency (0xff).
>>>>>>> +                     Requires ACPI CPPC autonomous selection register support.
>>>>>>> +                     Format: <bool>
>>>>>>> +                     Default: 0 (disabled)
>>>>>>> +                     0: use cpufreq governors
>>>>>>> +                     1: enable if supported by hardware
>>>>>>> +
>>>>>>>        cpu_init_udelay=N
>>>>>>>                        [X86,EARLY] Delay for N microsec between assert and de-assert
>>>>>>>                        of APIC INIT to start processors. This delay occurs
>>>>>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>>>>>> index 5dfb109cf1f4..49c148b2a0a4 100644
>>>>>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>>>>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>>>>>> @@ -28,6 +28,9 @@
>>>>>>>
>>>>>>>    static struct cpufreq_driver cppc_cpufreq_driver;
>>>>>>>
>>>>>>> +/* Autonomous Selection boot parameter */
>>>>>>> +static bool auto_sel_mode;
>>>>>>> +
>>>>>>>    #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>>>>>>>    static enum {
>>>>>>>        FIE_UNSET = -1,
>>>>>>> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>>>>>>        policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
>>>>>>>        cpu_data->perf_ctrls.desired_perf = caps->highest_perf;
>>>>>>>
>>>>>>> -     ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>>>>>> -     if (ret) {
>>>>>>> -             pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
>>>>>>> -                      caps->highest_perf, cpu, ret);
>>>>>>> -             goto out;
>>>>>>> +     /*
>>>>>>> +      * Enable autonomous mode on first init if boot param is set.
>>>>>>> +      * Check last_governor to detect first init and skip if auto_sel
>>>>>>> +      * is already enabled.
>>>>>>> +      */
>>>>>> If the goal is to set autosel only once at the driver init,
>>>>>> shouldn't this be done in cppc_cpufreq_init() ?
>>>>>> I understand that cpu_data doesn't exist yet in
>>>>>> cppc_cpufreq_init(), but this seems more appropriate to do
>>>>>> it there IMO.
>>>>>>
>>>>>> This means the cpudata should be updated accordingly
>>>>>> in this cppc_cpufreq_cpu_init() function.
>>>>> In an earlier version [1], the setup was in cppc_cpufreq_init() but
>>>>> was moved to cppc_cpufreq_cpu_init() to improve per-CPU error handling.
>>>>> Keeping the setup in cppc_cpufreq_init() helps to avoid the last_governor
>>>>> check. We can warn for a CPU failing to enable and continue so other
>>>>> CPUs keep autonomous mode.
>>>>> cppc_cpufreq_cpu_init() would then just check the auto_sel state
>>>>> from register and sync policy limits from min/max_perf registers when
>>>>> autonomous mode is active.
>>>>> Please let me know your thoughts.
>>>> FWIU the auto_sel_mode module parameter allows to
>>>> configure the default auto_sel_mode when the driver is
>>>> first loaded, so there should not need to check that again
>>>> whenever cppc_cpufreq_cpu_init() is called.
>>>> Maybe Ionela saw something we didn't see ?
>>>
>>> AFAIU, the concern in that review [1] was about error handling as the
>>> earlier version disabled auto_sel on all CPUs if any single CPU failed.
>>> Per-CPU error handling in cppc_cpufreq_init() (warn and continue)
>>> addresses that. Can't think of more reason.
>>> Do you have anything in mind?
>>>
>>
>> Actually, one case where cppc_cpufreq_cpu_init() would be needed
>> is when CPUs are offline at boot. So I will keep the setup in
>> cppc_cpufreq_cpu_init() in v2 same as present in current version.
>>
> Wouldn't it be possible to loop over the "cpu_present_mask"
> as you currently do in cppc_cpufreq_exit() ?

On ARM64 this works because the registers are accessed through
PCC/SystemMemory, which does not require the target CPU to be online.
But cppc_cpufreq.c is also built for RISC-V, where cpc_write_ffh()
uses smp_call_function_single().
So doing the setup in cppc_cpufreq_init() with for_each_present_cpu()
would fail on RISC-V + FFH platforms when CPUs are offline at boot.
cppc_cpufreq_cpu_init() handles all of these cases naturally.
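
To illustrate the difference, here is a minimal userspace sketch (not
kernel code). Every name in it is hypothetical: mock_ffh_write() only
models the constraint that an FFH write has to run on the target CPU
and therefore fails while that CPU is offline, and the two setup
helpers stand in for "loop over present CPUs at driver init" versus
"apply in the per-CPU init path once the CPU is online".

```c
#include <stdbool.h>

#define MOCK_NR_CPUS 4

/* Hypothetical model state: CPUs 2 and 3 are present but offline at boot. */
static bool mock_cpu_present[MOCK_NR_CPUS] = { true, true, true, true };
static bool mock_cpu_online[MOCK_NR_CPUS]  = { true, true, false, false };
static bool mock_auto_sel_reg[MOCK_NR_CPUS];

/*
 * Models an FFH register write that must execute on the target CPU.
 * An offline target is modelled as a failure (negative return).
 */
static int mock_ffh_write(int cpu, bool val)
{
	if (!mock_cpu_online[cpu])
		return -1;
	mock_auto_sel_reg[cpu] = val;
	return 0;
}

/*
 * Option A: one-shot setup over all present CPUs at driver init.
 * Warn-and-continue style: returns the number of CPUs that failed.
 */
static int mock_setup_at_driver_init(void)
{
	int cpu, failed = 0;

	for (cpu = 0; cpu < MOCK_NR_CPUS; cpu++) {
		if (!mock_cpu_present[cpu])
			continue;
		if (mock_ffh_write(cpu, true))
			failed++;
	}
	return failed;
}

/*
 * Option B: setup in the per-CPU init path, which only runs once the
 * CPU has come online, so the write can reach it.
 */
static int mock_online_and_cpu_init(int cpu)
{
	mock_cpu_online[cpu] = true;
	return mock_ffh_write(cpu, true);
}
```

In this model, option A leaves the two offline CPUs unconfigured, while
option B configures each CPU whenever it is brought up.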


>
> ------
>
> Another issue about relying on "cpu_data->perf_ctrls.auto_sel" in:
>
> """
> if (auto_sel_mode && policy->last_governor[0] == '\0' &&
>     !cpu_data->perf_ctrls.auto_sel) {
> """
>
> is that the cpu_data struct is fresh memory coming from
> cppc_cpufreq_get_cpu_data(), so it might always be 0
> I think.
>

cppc_cpufreq_get_cpu_data() calls cppc_get_perf() (added in [1]),
which reads perf_ctrls, including auto_sel, from the HW register.
So cpu_data->perf_ctrls.auto_sel reflects the actual HW state,
not the zeroed value left by kzalloc().
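
As a rough illustration of that ordering, a minimal userspace sketch
(not the driver code): the struct layouts, mock_hw_auto_sel and all
mock_* helpers are hypothetical stand-ins, with calloc() playing the
role of kzalloc() and mock_get_perf() playing the role of the register
read. The last helper mirrors the shape of the condition quoted above.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-ins for the driver structures. */
struct mock_perf_ctrls {
	unsigned int desired_perf;
	bool auto_sel;
};

struct mock_cpudata {
	struct mock_perf_ctrls perf_ctrls;
};

/* Pretend hardware state: firmware already enabled autonomous selection. */
static bool mock_hw_auto_sel = true;

/* Stand-in for the register read that fills perf_ctrls. */
static int mock_get_perf(struct mock_perf_ctrls *ctrls)
{
	ctrls->auto_sel = mock_hw_auto_sel;
	return 0;
}

/* Zeroed allocation first, then overwritten from the (mock) hardware. */
static struct mock_cpudata *mock_get_cpu_data(void)
{
	struct mock_cpudata *data = calloc(1, sizeof(*data));

	if (!data)
		return NULL;
	if (mock_get_perf(&data->perf_ctrls)) {
		free(data);
		return NULL;
	}
	return data;
}

/* Same shape as the quoted check: boot param set, first init, not yet enabled. */
static bool mock_should_apply_boot_param(bool auto_sel_mode,
					 const char *last_governor,
					 const struct mock_cpudata *data)
{
	return auto_sel_mode && last_governor[0] == '\0' &&
	       !data->perf_ctrls.auto_sel;
}
```

With the mock hardware reporting auto_sel as already enabled, the check
evaluates to false even though the allocation itself was zeroed.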

[1] https://lore.kernel.org/lkml/20260206142658.72583-2-sumitg@nvidia.com/

Thank you,
Sumit Gupta



