From: Mario Limonciello <mario.limonciello@amd.com>
To: Sumit Gupta <sumitg@nvidia.com>,
	rafael@kernel.org, viresh.kumar@linaro.org,
	pierre.gondois@arm.com, ionela.voinescu@arm.com,
	zhenglifeng1@huawei.com, zhanjie9@hisilicon.com, corbet@lwn.net,
	skhan@linuxfoundation.org, rdunlap@infradead.org,
	linux-pm@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: linux-tegra@vger.kernel.org, treding@nvidia.com,
	jonathanh@nvidia.com, vsethi@nvidia.com, ksitaraman@nvidia.com,
	sanjayc@nvidia.com, mochs@nvidia.com, bbasu@nvidia.com
Subject: Re: [PATCH v3 2/2] cpufreq: CPPC: add autonomous mode boot parameter support
Date: Mon, 18 May 2026 09:21:05 -0500
Message-ID: <7d7a6ab6-b1ea-484c-a275-19acca50c483@amd.com>
In-Reply-To: <e1a546f2-6e7e-4236-97bb-f72bea0137f7@nvidia.com>



On 5/18/26 09:15, Sumit Gupta wrote:
> 
> On 18/05/26 19:20, Mario Limonciello wrote:
>> On 5/18/26 08:44, Sumit Gupta wrote:
>>> Hi Mario,
>>>
>>>
>>> On 16/05/26 02:43, Mario Limonciello wrote:
>>>> On 5/15/26 07:26, Sumit Gupta wrote:
>>>>> Add a kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable
>>>>> CPPC autonomous performance selection on all CPUs at system startup.
>>>>> When autonomous mode is enabled, the hardware automatically adjusts
>>>>> CPU performance based on workload demands using Energy Performance
>>>>> Preference (EPP) hints.
>>>>>
>>>>> When the parameter is set:
>>>>> - Configure all CPUs for autonomous operation on first init
>>>>> - Use HW min/max_perf when available; otherwise initialize from caps
>>>>> - Initialize desired_perf to max_perf as a starting hint
>>>>> - Hardware controls frequency instead of the OS governor
>>>>> - EPP behavior depends on parameter value:
>>>>>    - performance (or 1): override EPP to performance preference (0x0)
>>>>>    - default_epp (or 2): preserve EPP value programmed by BIOS/firmware
>>>>>
>>>>> The boot parameter is applied only during first policy initialization.
>>>>> Skip applying it on CPU hotplug to preserve runtime sysfs configuration.
>>>>>
>>>>> This patch depends on patch series [1] ("cpufreq: Set policy->min and
>>>>> max as real QoS constraints") so that the policy->min/max set in
>>>>> cppc_cpufreq_cpu_init() are not overridden by cpufreq_set_policy()
>>>>> during init.
>>>>>
>>>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>>>>> ---
>>>>> [1] https://lore.kernel.org/lkml/20260511135538.522653-1-pierre.gondois@arm.com/
>>>>> ---
>>>>>   .../admin-guide/kernel-parameters.txt         |  16 +++
>>>>>   drivers/cpufreq/cppc_cpufreq.c                | 122 +++++++++++++++++-
>>>>>   2 files changed, 133 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>>>> index 0eb64aab3685..7e4b3a8fd76f 100644
>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>>> @@ -1048,6 +1048,22 @@ Kernel parameters
>>>>>                       policy to use. This governor must be registered in the
>>>>>                       kernel before the cpufreq driver probes.
>>>>>
>>>>> +     cppc_cpufreq.auto_sel_mode=
>>>>> +                     [CPU_FREQ] Enable ACPI CPPC autonomous performance
>>>>> +                     selection. When enabled, hardware automatically adjusts
>>>>> +                     CPU frequency on all CPUs based on workload demands.
>>>>> +                     In Autonomous mode, Energy Performance Preference (EPP)
>>>>> +                     hints guide hardware toward performance (0x0) or energy
>>>>> +                     efficiency (0xff).
>>>>> +                     Requires ACPI CPPC autonomous selection register
>>>>> +                     support.
>>>>> +                     Accepts:
>>>>> +                       performance, 1: enable auto_sel + set EPP to
>>>>> +                                       performance (0x0)
>>>>> +                       default_epp, 2: enable auto_sel, preserve EPP value
>>>>> +                                       programmed by BIOS/firmware
>>>>> +                     Unset: cpufreq governors are used (auto_sel disabled).
>>>>
>>>> Rather than unset doing nothing, have you considered having it take a
>>>> midpoint like 128?  That's what we do in amd-pstate (default to
>>>> balance_performance).  I think it turns into a reasonable balance.
>>>
>>> Thanks for the suggestion.
>>> I can add balance_performance that enables auto_sel with EPP=128 in v4.
>>>
>>> On changing the driver default (no param behavior) to auto enable
>>> balance_performance, it would be good to keep the current behavior for
>>> now since cppc_cpufreq is generic across ARM64/RISC-V platforms where
>>> EPP and Autonomous Selection registers are optional.
>>> A default change would affect existing users relying on governors.
>>>
>>> Thank you,
>>> Sumit Gupta
>>
>> But couldn't you make the "no module parameter set" case set the
>> registers only when they're available?
>>
>> Then the systems that support it start using it, and on the ones that
>> don't it's a NOP.
>>
> 
> Would it work to add balance_performance as a new mode in v4,
> and discuss changing the default separately as a follow-up?
> 

Sure.
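
For illustration only, the extra mode agreed on above could look like the
following user-space sketch. It is not the v4 patch: the enumerator
AUTO_SEL_BALANCE_PERFORMANCE, the numeric alias "3", and the helpers
streq_nl(), parse_auto_sel_mode() and auto_sel_mode_to_epp() are all
hypothetical names chosen here. Only the EPP value 128 and the two existing
modes come from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Modes 1 and 2 mirror the v3 patch; mode 3 is an assumed addition. */
enum auto_sel_mode {
	AUTO_SEL_DISABLED = 0,
	AUTO_SEL_PERFORMANCE = 1,
	AUTO_SEL_DEFAULT_EPP = 2,
	AUTO_SEL_BALANCE_PERFORMANCE = 3,	/* hypothetical */
};

/* EPP midpoint from the thread (0x0 = performance, 0xff = efficiency). */
#define EPP_BALANCE_PERFORMANCE 128

/* Compare, tolerating one trailing newline (in the spirit of sysfs_streq()). */
static int streq_nl(const char *val, const char *ref)
{
	size_t n = strlen(ref);

	if (strncmp(val, ref, n) != 0)
		return 0;
	return val[n] == '\0' || (val[n] == '\n' && val[n + 1] == '\0');
}

/* Parse a mode string; returns 0 on success or -EINVAL on unknown input. */
int parse_auto_sel_mode(const char *val, enum auto_sel_mode *out)
{
	assert(val != NULL && out != NULL);

	if (streq_nl(val, "performance") || streq_nl(val, "1"))
		*out = AUTO_SEL_PERFORMANCE;
	else if (streq_nl(val, "default_epp") || streq_nl(val, "2"))
		*out = AUTO_SEL_DEFAULT_EPP;
	else if (streq_nl(val, "balance_performance") || streq_nl(val, "3"))
		*out = AUTO_SEL_BALANCE_PERFORMANCE;
	else
		return -EINVAL;

	return 0;
}

/* EPP value to program for a mode, or -1 to keep the firmware value. */
int auto_sel_mode_to_epp(enum auto_sel_mode mode)
{
	switch (mode) {
	case AUTO_SEL_PERFORMANCE:
		return 0x0;
	case AUTO_SEL_BALANCE_PERFORMANCE:
		return EPP_BALANCE_PERFORMANCE;
	default:
		return -1;
	}
}
```

Keeping the string-to-mode step separate from the mode-to-EPP step means
default_epp needs no special case at the call site: a negative result simply
means "do not touch EPP".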

> Runtime detection helps for unsupported platforms. But platforms which
> support the registers use OS governors today, and silently switching
> them to autonomous mode on a kernel update is a behavior change for
> existing users. They would also have no way to boot with a software governor.
> 

But hopefully it should be better battery life/responsiveness for those 
scenarios too, right?
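
As a rough sketch of the "use it where supported, NOP elsewhere" default
being discussed, the logic could be modelled in user space like this. The
struct fake_cpu and the fake_set_*() helpers are stand-ins invented for this
example; they only imitate the -EOPNOTSUPP convention that the quoted patch
checks for on cppc_set_epp() and cppc_set_auto_sel().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU model of the optional CPPC registers. */
struct fake_cpu {
	bool has_auto_sel;	/* Autonomous Selection register present */
	bool has_epp;		/* EPP register present */
	bool auto_sel;		/* current autonomous-selection state */
	int epp;		/* current EPP, -1 if never written */
};

/* Stand-in for an EPP write: unsupported registers report -EOPNOTSUPP. */
int fake_set_epp(struct fake_cpu *c, int epp)
{
	if (!c->has_epp)
		return -EOPNOTSUPP;
	c->epp = epp;
	return 0;
}

/* Stand-in for an auto_sel write with the same error convention. */
int fake_set_auto_sel(struct fake_cpu *c, bool on)
{
	if (!c->has_auto_sel)
		return -EOPNOTSUPP;
	c->auto_sel = on;
	return 0;
}

/*
 * Try to enable autonomous mode by default. On CPUs without the
 * Autonomous Selection register nothing is written at all, so the
 * OS governors keep working as before. Returns true when the CPU
 * ends up in autonomous mode.
 */
bool try_default_autonomous(struct fake_cpu *c, int epp)
{
	assert(c != NULL);

	if (!c->has_auto_sel)
		return false;		/* pure NOP on unsupported platforms */

	/* EPP is optional: a missing register is not an error. */
	(void)fake_set_epp(c, epp);

	return fake_set_auto_sel(c, true) == 0;
}
```

This only models the detection side. It does not address the concern raised
above that platforms which do implement the registers would change behavior
on a kernel update.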

> 
> 
>>>
>>>
>>>>
>>>>> +
>>>>>       cpu_init_udelay=N
>>>>>                       [X86,EARLY] Delay for N microsec between assert and de-assert
>>>>>                       of APIC INIT to start processors. This delay occurs
>>>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>>>>> index 6b54427b52e1..5f4d735e7c7d 100644
>>>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>>>> @@ -28,6 +28,43 @@
>>>>>
>>>>>   static struct cpufreq_driver cppc_cpufreq_driver;
>>>>>
>>>>> +/* Autonomous Selection boot parameter modes */
>>>>> +enum {
>>>>> +     AUTO_SEL_PERFORMANCE = 1,
>>>>> +     AUTO_SEL_DEFAULT_EPP = 2,
>>>>> +};
>>>>> +
>>>>> +static int auto_sel_mode;
>>>>> +
>>>>> +static int auto_sel_mode_set(const char *val, const struct kernel_param *kp)
>>>>> +{
>>>>> +     if (sysfs_streq(val, "performance") || sysfs_streq(val, "1"))
>>>>> +             *(int *)kp->arg = AUTO_SEL_PERFORMANCE;
>>>>> +     else if (sysfs_streq(val, "default_epp") || sysfs_streq(val, "2"))
>>>>> +             *(int *)kp->arg = AUTO_SEL_DEFAULT_EPP;
>>>>> +     else
>>>>> +             return -EINVAL;
>>>>> +
>>>>> +     return 0;
>>>>> +}
>>>>> +
>>>>> +static int auto_sel_mode_get(char *buffer, const struct kernel_param *kp)
>>>>> +{
>>>>> +     switch (*(int *)kp->arg) {
>>>>> +     case AUTO_SEL_PERFORMANCE:
>>>>> +             return sysfs_emit(buffer, "performance\n");
>>>>> +     case AUTO_SEL_DEFAULT_EPP:
>>>>> +             return sysfs_emit(buffer, "default_epp\n");
>>>>> +     default:
>>>>> +             return sysfs_emit(buffer, "disabled\n");
>>>>> +     }
>>>>> +}
>>>>> +
>>>>> +static const struct kernel_param_ops auto_sel_mode_ops = {
>>>>> +     .set = auto_sel_mode_set,
>>>>> +     .get = auto_sel_mode_get,
>>>>> +};
>>>>> +
>>>>>   #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>>>>>   static enum {
>>>>>       FIE_UNSET = -1,
>>>>> @@ -715,11 +752,75 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>>>>       policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
>>>>>       cpu_data->perf_ctrls.desired_perf = caps->highest_perf;
>>>>>
>>>>> -     ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>>>> -     if (ret) {
>>>>> -             pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
>>>>> -                      caps->highest_perf, cpu, ret);
>>>>> -             goto out;
>>>>> +     /*
>>>>> +      * Enable autonomous mode on first init if boot param is set.
>>>>> +      * Check last_governor to detect first init and skip if auto_sel
>>>>> +      * is already enabled.
>>>>> +      */
>>>>> +     if (auto_sel_mode && policy->last_governor[0] == '\0' &&
>>>>> +         !cpu_data->perf_ctrls.auto_sel) {
>>>>> +             /* Init min/max_perf from caps if not already set by HW. */
>>>>> +             if (!cpu_data->perf_ctrls.min_perf)
>>>>> +                     cpu_data->perf_ctrls.min_perf = caps->lowest_nonlinear_perf;
>>>>> +             if (!cpu_data->perf_ctrls.max_perf)
>>>>> +                     cpu_data->perf_ctrls.max_perf = policy->boost_enabled ?
>>>>> +                             caps->highest_perf : caps->nominal_perf;
>>>>> +
>>>>> +             /*
>>>>> +              * In autonomous mode desired_perf is only a hint; EPP and
>>>>> +              * the platform drive actual selection within [min, max].
>>>>> +              * Initialize it to max_perf so HW starts at the upper bound.
>>>>> +              */
>>>>> +             cpu_data->perf_ctrls.desired_perf = cpu_data->perf_ctrls.max_perf;
>>>>> +
>>>>> +             policy->cur = cppc_perf_to_khz(caps,
>>>>> +                                            cpu_data->perf_ctrls.desired_perf);
>>>>> +
>>>>> +             /*
>>>>> +              * Override EPP only in 'performance' mode; 'default_epp' mode
>>>>> +              * preserves the BIOS/firmware programmed EPP value.
>>>>> +              * EPP is optional - some platforms may not support it.
>>>>> +              */
>>>>> +             if (auto_sel_mode == AUTO_SEL_PERFORMANCE) {
>>>>> +                     ret = cppc_set_epp(cpu, CPPC_EPP_PERFORMANCE_PREF);
>>>>> +                     if (ret && ret != -EOPNOTSUPP)
>>>>> +                             pr_warn("Failed to set EPP for CPU%d (%d)\n", cpu, ret);
>>>>> +                     else if (!ret)
>>>>> +                             cpu_data->perf_ctrls.energy_perf = CPPC_EPP_PERFORMANCE_PREF;
>>>>> +             }
>>>>> +
>>>>> +             /* Program min/max/desired into CPPC regs (non-fatal on failure). */
>>>>> +             ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>>>> +             if (ret)
>>>>> +                     pr_warn("set_perf failed CPU%d (%d); using HW values\n",
>>>>> +                             cpu, ret);
>>>>> +
>>>>> +             ret = cppc_set_auto_sel(cpu, true);
>>>>> +             if (ret && ret != -EOPNOTSUPP)
>>>>> +                     pr_warn("auto_sel CPU%d failed (%d); using OS mode\n",
>>>>> +                             cpu, ret);
>>>>> +             else if (!ret)
>>>>> +                     cpu_data->perf_ctrls.auto_sel = true;
>>>>> +     }
>>>>> +
>>>>> +     if (cpu_data->perf_ctrls.auto_sel) {
>>>>> +             /* Sync policy limits from HW when autonomous mode is active */
>>>>> +             policy->min = cppc_perf_to_khz(caps,
>>>>> +                                            cpu_data->perf_ctrls.min_perf ?:
>>>>> +                                            caps->lowest_nonlinear_perf);
>>>>> +             policy->max = cppc_perf_to_khz(caps,
>>>>> +                                            cpu_data->perf_ctrls.max_perf ?:
>>>>> +                                            (policy->boost_enabled ?
>>>>> +                                             caps->highest_perf :
>>>>> +                                             caps->nominal_perf));
>>>>> +     } else {
>>>>> +             /* Normal mode: governors control frequency */
>>>>> +             ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>>>> +             if (ret) {
>>>>> +                     pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
>>>>> +                              caps->highest_perf, cpu, ret);
>>>>> +                     goto out;
>>>>> +             }
>>>>>       }
>>>>>
>>>>>       cppc_cpufreq_cpu_fie_init(policy);
>>>>> @@ -1079,10 +1180,21 @@ static int __init cppc_cpufreq_init(void)
>>>>>
>>>>>   static void __exit cppc_cpufreq_exit(void)
>>>>>   {
>>>>> +     unsigned int cpu;
>>>>> +
>>>>> +     for_each_present_cpu(cpu)
>>>>> +             cppc_set_auto_sel(cpu, false);
>>>>> +
>>>>>       cpufreq_unregister_driver(&cppc_cpufreq_driver);
>>>>>       cppc_freq_invariance_exit();
>>>>>   }
>>>>>
>>>>> +module_param_cb(auto_sel_mode, &auto_sel_mode_ops, &auto_sel_mode, 0444);
>>>>> +MODULE_PARM_DESC(auto_sel_mode,
>>>>> +              "Enable CPPC autonomous performance selection at boot: "
>>>>> +              "performance or 1 (EPP=performance), "
>>>>> +              "default_epp or 2 (preserve BIOS/firmware EPP)");
>>>>> +
>>>>>   module_exit(cppc_cpufreq_exit);
>>>>>   MODULE_AUTHOR("Ashwin Chaugule");
>>>>>   MODULE_DESCRIPTION("CPUFreq driver based on the ACPI CPPC v5.0+ spec");
>>>>
>>
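
For readers following the quoted cppc_cpufreq_cpu_init() hunk, the min/max
fallback it applies before enabling autonomous mode can be restated as the
standalone sketch below. The struct layouts are trimmed-down assumptions
that keep only the fields used here, and init_auto_sel_bounds() is a
hypothetical helper name; the kernel code does this inline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the CPPC capability and control structures. */
struct perf_caps {
	uint32_t highest_perf;
	uint32_t nominal_perf;
	uint32_t lowest_nonlinear_perf;
};

struct perf_ctrls {
	uint32_t min_perf;	/* 0 means "not set by hardware/firmware" */
	uint32_t max_perf;	/* 0 means "not set by hardware/firmware" */
	uint32_t desired_perf;
};

/*
 * Keep min/max values already present in the registers; otherwise
 * derive them from the capabilities. desired_perf is only a hint in
 * autonomous mode, so it starts at the upper bound.
 */
void init_auto_sel_bounds(struct perf_ctrls *ctrls,
			  const struct perf_caps *caps,
			  bool boost_enabled)
{
	assert(ctrls != NULL && caps != NULL);

	if (!ctrls->min_perf)
		ctrls->min_perf = caps->lowest_nonlinear_perf;
	if (!ctrls->max_perf)
		ctrls->max_perf = boost_enabled ? caps->highest_perf
						: caps->nominal_perf;

	ctrls->desired_perf = ctrls->max_perf;
}
```

With boost disabled and nothing pre-programmed, the bounds become
[lowest_nonlinear_perf, nominal_perf]; values already set by firmware are
left untouched.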

