* [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
@ 2026-03-17 15:10 Sumit Gupta
  2026-03-24 18:18 ` Pierre Gondois
  0 siblings, 1 reply; 10+ messages in thread
From: Sumit Gupta @ 2026-03-17 15:10 UTC (permalink / raw)
  To: rafael, viresh.kumar, pierre.gondois, ionela.voinescu,
	zhenglifeng1, zhanjie9, corbet, skhan, rdunlap,
	mario.limonciello, linux-pm, linux-doc, linux-kernel
  Cc: linux-tegra, treding, jonathanh, vsethi, ksitaraman, sanjayc,
	nhartman, mochs, bbasu, sumitg

Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC
autonomous performance selection on all CPUs at system startup without
requiring runtime sysfs manipulation. When autonomous mode is enabled,
the hardware automatically adjusts CPU performance based on workload
demands using Energy Performance Preference (EPP) hints.

When auto_sel_mode=1:
- Configure all CPUs for autonomous operation on first init
- Set EPP to performance preference (0x0)
- Use HW min/max when set; otherwise program from policy limits (caps)
- Clamp desired_perf to bounds before enabling autonomous mode
- Hardware controls frequency instead of the OS governor

The boot parameter is applied only during first policy initialization.
On hotplug, skip applying it so that the user's runtime sysfs
configuration is preserved.

Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation)
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
---
Part 1 [1] of this series was applied for 7.1 and present in next.
Sending this patch as reworked version of 'patch 11' from [2] based
on next.

[1] https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/
[2] https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/
---
 .../admin-guide/kernel-parameters.txt         | 13 +++
 drivers/cpufreq/cppc_cpufreq.c                | 84 +++++++++++++++++--
 2 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fa6171b5fdd5..de4b4c89edfe 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1060,6 +1060,19 @@ Kernel parameters
 			policy to use. This governor must be registered in the
 			kernel before the cpufreq driver probes.
 
+	cppc_cpufreq.auto_sel_mode=
+			[CPU_FREQ] Enable ACPI CPPC autonomous performance
+			selection. When enabled, hardware automatically adjusts
+			CPU frequency on all CPUs based on workload demands.
+			In Autonomous mode, Energy Performance Preference (EPP)
+			hints guide hardware toward performance (0x0) or energy
+			efficiency (0xff).
+			Requires ACPI CPPC autonomous selection register support.
+			Format: <bool>
+			Default: 0 (disabled)
+			0: use cpufreq governors
+			1: enable if supported by hardware
+
 	cpu_init_udelay=N
 			[X86,EARLY] Delay for N microsec between assert and de-assert
 			of APIC INIT to start processors. This delay occurs
diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 5dfb109cf1f4..49c148b2a0a4 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -28,6 +28,9 @@
 
 static struct cpufreq_driver cppc_cpufreq_driver;
 
+/* Autonomous Selection boot parameter */
+static bool auto_sel_mode;
+
 #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
 static enum {
 	FIE_UNSET = -1,
@@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
 	policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
 	cpu_data->perf_ctrls.desired_perf = caps->highest_perf;
 
-	ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
-	if (ret) {
-		pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
-			 caps->highest_perf, cpu, ret);
-		goto out;
+	/*
+	 * Enable autonomous mode on first init if boot param is set.
+	 * Check last_governor to detect first init and skip if auto_sel
+	 * is already enabled.
+	 */
+	if (auto_sel_mode && policy->last_governor[0] == '\0' &&
+	    !cpu_data->perf_ctrls.auto_sel) {
+		/* Enable CPPC - optional register, some platforms need it */
+		ret = cppc_set_enable(cpu, true);
+		if (ret && ret != -EOPNOTSUPP)
+			pr_warn("Failed to enable CPPC for CPU%d (%d)\n", cpu, ret);
+
+		/*
+		 * Prefer HW min/max_perf when set; otherwise program from
+		 * policy limits derived earlier from caps.
+		 * Clamp desired_perf to bounds and sync policy->cur.
+		 */
+		if (!cpu_data->perf_ctrls.min_perf || !cpu_data->perf_ctrls.max_perf)
+			cppc_cpufreq_update_perf_limits(cpu_data, policy);
+
+		cpu_data->perf_ctrls.desired_perf =
+			clamp_t(u32, cpu_data->perf_ctrls.desired_perf,
+				cpu_data->perf_ctrls.min_perf,
+				cpu_data->perf_ctrls.max_perf);
+
+		policy->cur = cppc_perf_to_khz(caps,
+					       cpu_data->perf_ctrls.desired_perf);
+
+		/* EPP is optional - some platforms may not support it */
+		ret = cppc_set_epp(cpu, CPPC_EPP_PERFORMANCE_PREF);
+		if (ret && ret != -EOPNOTSUPP)
+			pr_warn("Failed to set EPP for CPU%d (%d)\n", cpu, ret);
+		else if (!ret)
+			cpu_data->perf_ctrls.energy_perf = CPPC_EPP_PERFORMANCE_PREF;
+
+		ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
+		if (ret) {
+			pr_debug("Err setting perf for autonomous mode CPU:%d ret:%d\n",
+				 cpu, ret);
+			goto out;
+		}
+
+		ret = cppc_set_auto_sel(cpu, true);
+		if (ret && ret != -EOPNOTSUPP) {
+			pr_warn("Failed autonomous config for CPU%d (%d)\n",
+				cpu, ret);
+			goto out;
+		}
+		if (!ret)
+			cpu_data->perf_ctrls.auto_sel = true;
+	}
+
+	if (cpu_data->perf_ctrls.auto_sel) {
+		/* Sync policy limits from HW when autonomous mode is active */
+		policy->min = cppc_perf_to_khz(caps,
+					       cpu_data->perf_ctrls.min_perf ?:
+					       caps->lowest_nonlinear_perf);
+		policy->max = cppc_perf_to_khz(caps,
+					       cpu_data->perf_ctrls.max_perf ?:
+					       caps->nominal_perf);
+	} else {
+		/* Normal mode: governors control frequency */
+		ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
+		if (ret) {
+			pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
+				 caps->highest_perf, cpu, ret);
+			goto out;
+		}
 	}
 
 	cppc_cpufreq_cpu_fie_init(policy);
@@ -1038,10 +1104,18 @@ static int __init cppc_cpufreq_init(void)
 
 static void __exit cppc_cpufreq_exit(void)
 {
+	unsigned int cpu;
+
+	for_each_present_cpu(cpu)
+		cppc_set_auto_sel(cpu, false);
+
 	cpufreq_unregister_driver(&cppc_cpufreq_driver);
 	cppc_freq_invariance_exit();
 }
 
+module_param(auto_sel_mode, bool, 0444);
+MODULE_PARM_DESC(auto_sel_mode, "Enable CPPC autonomous performance selection at boot");
+
 module_exit(cppc_cpufreq_exit);
 MODULE_AUTHOR("Ashwin Chaugule");
 MODULE_DESCRIPTION("CPUFreq driver based on the ACPI CPPC v5.0+ spec");
-- 
2.34.1

^ permalink raw reply related	[flat|nested] 10+ messages in thread
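For readers following the bound handling in the patch above, here is a minimal user-space sketch of the "prefer HW min/max, otherwise fall back, then clamp desired_perf" step. It is not kernel code: `perf_ctrls_model`, `clamp_u32()` and `prepare_autonomous_bounds()` are hypothetical names introduced for illustration, and the sketch assumes that the fallback replaces both bounds together when either one reads as zero, which the thread does not confirm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf_ctrls fields the patch touches. */
struct perf_ctrls_model {
	uint32_t min_perf;	/* 0 is treated as "not set by firmware" */
	uint32_t max_perf;	/* 0 is treated as "not set by firmware" */
	uint32_t desired_perf;
};

/* Plain-C equivalent of what clamp_t(u32, val, lo, hi) computes. */
static uint32_t clamp_u32(uint32_t val, uint32_t lo, uint32_t hi)
{
	assert(lo <= hi);
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/*
 * Keep the HW bounds when both are set; otherwise fall back to the
 * caller-supplied policy limits (assumption: both bounds are replaced
 * together). Then clamp desired_perf into [min_perf, max_perf].
 */
static void prepare_autonomous_bounds(struct perf_ctrls_model *ctrls,
				      uint32_t policy_min_perf,
				      uint32_t policy_max_perf)
{
	if (!ctrls->min_perf || !ctrls->max_perf) {
		ctrls->min_perf = policy_min_perf;
		ctrls->max_perf = policy_max_perf;
	}

	ctrls->desired_perf = clamp_u32(ctrls->desired_perf,
					ctrls->min_perf, ctrls->max_perf);
}
```

With unset bounds and policy limits of 100 and 250, a desired_perf of 300 ends up at 250; with HW bounds of 120 and 200 already set, the policy limits are ignored and a desired_perf of 50 is raised to 120.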
* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
  2026-03-17 15:10 [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support Sumit Gupta
@ 2026-03-24 18:18 ` Pierre Gondois
  2026-04-06 18:08   ` Sumit Gupta
  0 siblings, 1 reply; 10+ messages in thread
From: Pierre Gondois @ 2026-03-24 18:18 UTC (permalink / raw)
  To: Sumit Gupta
  Cc: linux-tegra, linux-kernel, linux-doc, zhenglifeng1, treding,
	viresh.kumar, jonathanh, vsethi, ionela.voinescu, ksitaraman,
	sanjayc, zhanjie9, nhartman, corbet, mochs, skhan, bbasu,
	rdunlap, linux-pm, mario.limonciello, rafael

Hello Sumit,

On 3/17/26 16:10, Sumit Gupta wrote:
> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC
> autonomous performance selection on all CPUs at system startup without
> requiring runtime sysfs manipulation. When autonomous mode is enabled,
> the hardware automatically adjusts CPU performance based on workload
> demands using Energy Performance Preference (EPP) hints.
>
> When auto_sel_mode=1:
> - Configure all CPUs for autonomous operation on first init
> - Set EPP to performance preference (0x0)
> - Use HW min/max when set; otherwise program from policy limits (caps)
> - Clamp desired_perf to bounds before enabling autonomous mode
> - Hardware controls frequency instead of the OS governor
>
> The boot parameter is applied only during first policy initialization.
> On hotplug, skip applying it so that the user's runtime sysfs
> configuration is preserved.
>
> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation)
> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
> ---
> Part 1 [1] of this series was applied for 7.1 and present in next.
> Sending this patch as reworked version of 'patch 11' from [2] based
> on next.
>
> [1] https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/
> [2] https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/
> ---
>  .../admin-guide/kernel-parameters.txt         | 13 +++
>  drivers/cpufreq/cppc_cpufreq.c                | 84 +++++++++++++++++--
>  2 files changed, 92 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index fa6171b5fdd5..de4b4c89edfe 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1060,6 +1060,19 @@ Kernel parameters
>  			policy to use. This governor must be registered in the
>  			kernel before the cpufreq driver probes.
>
> +	cppc_cpufreq.auto_sel_mode=
> +			[CPU_FREQ] Enable ACPI CPPC autonomous performance
> +			selection. When enabled, hardware automatically adjusts
> +			CPU frequency on all CPUs based on workload demands.
> +			In Autonomous mode, Energy Performance Preference (EPP)
> +			hints guide hardware toward performance (0x0) or energy
> +			efficiency (0xff).
> +			Requires ACPI CPPC autonomous selection register support.
> +			Format: <bool>
> +			Default: 0 (disabled)
> +			0: use cpufreq governors
> +			1: enable if supported by hardware
> +
>  	cpu_init_udelay=N
>  			[X86,EARLY] Delay for N microsec between assert and de-assert
>  			of APIC INIT to start processors. This delay occurs
> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
> index 5dfb109cf1f4..49c148b2a0a4 100644
> --- a/drivers/cpufreq/cppc_cpufreq.c
> +++ b/drivers/cpufreq/cppc_cpufreq.c
> @@ -28,6 +28,9 @@
>
>  static struct cpufreq_driver cppc_cpufreq_driver;
>
> +/* Autonomous Selection boot parameter */
> +static bool auto_sel_mode;
> +
>  #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>  static enum {
>  	FIE_UNSET = -1,
> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>  	policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
>  	cpu_data->perf_ctrls.desired_perf = caps->highest_perf;
>
> -	ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
> -	if (ret) {
> -		pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
> -			 caps->highest_perf, cpu, ret);
> -		goto out;
> +	/*
> +	 * Enable autonomous mode on first init if boot param is set.
> +	 * Check last_governor to detect first init and skip if auto_sel
> +	 * is already enabled.
> +	 */
If the goal is to set autosel only once at the driver init,
shouldn't this be done in cppc_cpufreq_init() ?
I understand that cpu_data doesn't exist yet in
cppc_cpufreq_init(), but this seems more appropriate to do
it there IMO.

This means the cpudata should be updated accordingly
in this cppc_cpufreq_cpu_init() function.

> +	if (auto_sel_mode && policy->last_governor[0] == '\0' &&
> +	    !cpu_data->perf_ctrls.auto_sel) {
> +		/* Enable CPPC - optional register, some platforms need it */
The documentation of the CPPC Enable Register is subject to
interpretation, but IIUC the field should be set to use the CPPC
controls, so I assume this should be set in cppc_cpufreq_init()
instead ?
> +		ret = cppc_set_enable(cpu, true);
> +		if (ret && ret != -EOPNOTSUPP)
> +			pr_warn("Failed to enable CPPC for CPU%d (%d)\n", cpu, ret);
> +
> +		/*
> +		 * Prefer HW min/max_perf when set; otherwise program from
> +		 * policy limits derived earlier from caps.
> +		 * Clamp desired_perf to bounds and sync policy->cur.
> +		 */
> +		if (!cpu_data->perf_ctrls.min_perf || !cpu_data->perf_ctrls.max_perf)

The function doesn't seem to exist.

> +			cppc_cpufreq_update_perf_limits(cpu_data, policy);
> +
> +		cpu_data->perf_ctrls.desired_perf =
> +			clamp_t(u32, cpu_data->perf_ctrls.desired_perf,
> +				cpu_data->perf_ctrls.min_perf,
> +				cpu_data->perf_ctrls.max_perf);
> +
> +		policy->cur = cppc_perf_to_khz(caps,
> +					       cpu_data->perf_ctrls.desired_perf);
> +

Maybe this should also be done in cppc_cpufreq_init()
if the auto_sel_mode parameter is set ?

> +		/* EPP is optional - some platforms may not support it */
> +		ret = cppc_set_epp(cpu, CPPC_EPP_PERFORMANCE_PREF);
> +		if (ret && ret != -EOPNOTSUPP)
> +			pr_warn("Failed to set EPP for CPU%d (%d)\n", cpu, ret);
> +		else if (!ret)
> +			cpu_data->perf_ctrls.energy_perf = CPPC_EPP_PERFORMANCE_PREF;
> +
> +		ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
> +		if (ret) {
> +			pr_debug("Err setting perf for autonomous mode CPU:%d ret:%d\n",
> +				 cpu, ret);
> +			goto out;
> +		}
> +
> +		ret = cppc_set_auto_sel(cpu, true);
> +		if (ret && ret != -EOPNOTSUPP) {
> +			pr_warn("Failed autonomous config for CPU%d (%d)\n",
> +				cpu, ret);
> +			goto out;
> +		}
> +		if (!ret)
> +			cpu_data->perf_ctrls.auto_sel = true;
> +	}
> +
> +	if (cpu_data->perf_ctrls.auto_sel) {

There is a patchset ongoing which tries to remove
setting policy->min/max from driver initialization.
Indeed, these values are only temporarily valid,
until the governor override them.
It is not sure yet the patch will be accepted though.

https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/

> +		/* Sync policy limits from HW when autonomous mode is active */
> +		policy->min = cppc_perf_to_khz(caps,
> +					       cpu_data->perf_ctrls.min_perf ?:
> +					       caps->lowest_nonlinear_perf);
> +		policy->max = cppc_perf_to_khz(caps,
> +					       cpu_data->perf_ctrls.max_perf ?:
> +					       caps->nominal_perf);
> +	} else {
> +		/* Normal mode: governors control frequency */
> +		ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
> +		if (ret) {
> +			pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
> +				 caps->highest_perf, cpu, ret);
> +			goto out;
> +		}
>  	}
>
>  	cppc_cpufreq_cpu_fie_init(policy);
> @@ -1038,10 +1104,18 @@ static int __init cppc_cpufreq_init(void)
>
>  static void __exit cppc_cpufreq_exit(void)
>  {
> +	unsigned int cpu;
> +
> +	for_each_present_cpu(cpu)
> +		cppc_set_auto_sel(cpu, false);

If the firmware has a default EPP value, it means that loading
and the unloading the driver will reset this default EPP value.
Maybe the initial EPP value and/or the auto_sel value should be
cached somewhere and restored on exit ?
I don't know if this is actually an issue, this is just to signal it.

> +
>  	cpufreq_unregister_driver(&cppc_cpufreq_driver);
>  	cppc_freq_invariance_exit();
>  }
>
> +module_param(auto_sel_mode, bool, 0444);
> +MODULE_PARM_DESC(auto_sel_mode, "Enable CPPC autonomous performance selection at boot");
> +
>  module_exit(cppc_cpufreq_exit);
>  MODULE_AUTHOR("Ashwin Chaugule");
>  MODULE_DESCRIPTION("CPUFreq driver based on the ACPI CPPC v5.0+ spec");

^ permalink raw reply	[flat|nested] 10+ messages in thread
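The "cache the initial EPP and auto_sel values and restore them on exit" idea raised in the review above could look roughly like the following user-space model. This is only a sketch under stated assumptions: the register file is simulated with a plain array, and `cppc_regs_model`, `saved_state`, `model_enable_autonomous()` and `model_restore()` are hypothetical names, not functions from the driver or from the ACPI CPPC library.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4

/* Simulated per-CPU CPPC registers (hypothetical, for illustration only). */
struct cppc_regs_model {
	bool auto_sel;
	uint8_t epp;
};

/* Snapshot of the values found before the driver touched them. */
struct saved_state {
	bool valid;
	bool auto_sel;
	uint8_t epp;
};

static struct cppc_regs_model regs[MODEL_NR_CPUS];
static struct saved_state saved[MODEL_NR_CPUS];

/* Record the firmware values once, then program performance EPP + auto_sel. */
static void model_enable_autonomous(unsigned int cpu)
{
	if (cpu >= MODEL_NR_CPUS)
		return;

	if (!saved[cpu].valid) {
		saved[cpu].auto_sel = regs[cpu].auto_sel;
		saved[cpu].epp = regs[cpu].epp;
		saved[cpu].valid = true;
	}

	regs[cpu].epp = 0x0;		/* performance preference */
	regs[cpu].auto_sel = true;
}

/* Put back whatever was found at enable time; no-op if nothing was saved. */
static void model_restore(unsigned int cpu)
{
	if (cpu >= MODEL_NR_CPUS || !saved[cpu].valid)
		return;

	regs[cpu].auto_sel = saved[cpu].auto_sel;
	regs[cpu].epp = saved[cpu].epp;
	saved[cpu].valid = false;
}
```

The snapshot is taken only on the first enable, so a repeated enable does not overwrite the firmware defaults with values the driver itself programmed. Whether a real driver should restore these registers at all is the open question in the thread.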
* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
  2026-03-24 18:18 ` Pierre Gondois
@ 2026-04-06 18:08   ` Sumit Gupta
  2026-04-10 13:47     ` Pierre Gondois
  0 siblings, 1 reply; 10+ messages in thread
From: Sumit Gupta @ 2026-04-06 18:08 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: linux-tegra, linux-kernel, linux-doc, zhenglifeng1, treding,
	viresh.kumar, jonathanh, vsethi, ionela.voinescu, ksitaraman,
	sanjayc, zhanjie9, corbet, mochs, skhan, bbasu, rdunlap,
	linux-pm, mario.limonciello, rafael, sumitg

Hi Pierre,

Thank you for the comments.
Sorry for late reply as I was on vacation.

On 24/03/26 23:48, Pierre Gondois wrote:
> External email: Use caution opening links or attachments
>
>
> Hello Sumit,
>
> On 3/17/26 16:10, Sumit Gupta wrote:
>> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC
>> autonomous performance selection on all CPUs at system startup without
>> requiring runtime sysfs manipulation. When autonomous mode is enabled,
>> the hardware automatically adjusts CPU performance based on workload
>> demands using Energy Performance Preference (EPP) hints.
>>
>> When auto_sel_mode=1:
>> - Configure all CPUs for autonomous operation on first init
>> - Set EPP to performance preference (0x0)
>> - Use HW min/max when set; otherwise program from policy limits (caps)
>> - Clamp desired_perf to bounds before enabling autonomous mode
>> - Hardware controls frequency instead of the OS governor
>>
>> The boot parameter is applied only during first policy initialization.
>> On hotplug, skip applying it so that the user's runtime sysfs
>> configuration is preserved.
>>
>> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation)
>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>> ---
>> Part 1 [1] of this series was applied for 7.1 and present in next.
>> Sending this patch as reworked version of 'patch 11' from [2] based
>> on next.
>>
>> [1]
>> https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/
>> [2]
>> https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/
>> ---
>>  .../admin-guide/kernel-parameters.txt         | 13 +++
>>  drivers/cpufreq/cppc_cpufreq.c                | 84 +++++++++++++++++--
>>  2 files changed, 92 insertions(+), 5 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index fa6171b5fdd5..de4b4c89edfe 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -1060,6 +1060,19 @@ Kernel parameters
>>  			policy to use. This governor must be registered
>> in the
>>  			kernel before the cpufreq driver probes.
>>
>> +	cppc_cpufreq.auto_sel_mode=
>> +			[CPU_FREQ] Enable ACPI CPPC autonomous performance
>> +			selection. When enabled, hardware automatically
>> adjusts
>> +			CPU frequency on all CPUs based on workload
>> demands.
>> +			In Autonomous mode, Energy Performance
>> Preference (EPP)
>> +			hints guide hardware toward performance (0x0)
>> or energy
>> +			efficiency (0xff).
>> +			Requires ACPI CPPC autonomous selection
>> register support.
>> +			Format: <bool>
>> +			Default: 0 (disabled)
>> +			0: use cpufreq governors
>> +			1: enable if supported by hardware
>> +
>>  	cpu_init_udelay=N
>>  			[X86,EARLY] Delay for N microsec between assert
>> and de-assert
>>  			of APIC INIT to start processors. This delay
>> occurs
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c
>> b/drivers/cpufreq/cppc_cpufreq.c
>> index 5dfb109cf1f4..49c148b2a0a4 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -28,6 +28,9 @@
>>
>>  static struct cpufreq_driver cppc_cpufreq_driver;
>>
>> +/* Autonomous Selection boot parameter */
>> +static bool auto_sel_mode;
>> +
>>  #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>>  static enum {
>>  	FIE_UNSET = -1,
>> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct
>> cpufreq_policy *policy)
>>  	policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
>>  	cpu_data->perf_ctrls.desired_perf = caps->highest_perf;
>>
>> -	ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>> -	if (ret) {
>> -		pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
>> -			 caps->highest_perf, cpu, ret);
>> -		goto out;
>> +	/*
>> +	 * Enable autonomous mode on first init if boot param is set.
>> +	 * Check last_governor to detect first init and skip if auto_sel
>> +	 * is already enabled.
>> +	 */
> If the goal is to set autosel only once at the driver init,
> shouldn't this be done in cppc_cpufreq_init() ?
> I understand that cpu_data doesn't exist yet in
> cppc_cpufreq_init(), but this seems more appropriate to do
> it there IMO.
>
> This means the cpudata should be updated accordingly
> in this cppc_cpufreq_cpu_init() function.

In an earlier version [1], the setup was in cppc_cpufreq_init() but
was moved to cppc_cpufreq_cpu_init() to improve per-CPU error handling.
Keeping the setup in cppc_cpufreq_init() helps to avoid the last_governor
check. We can warn for a CPU failing to enable and continue so other
CPUs keep autonomous mode.
cppc_cpufreq_cpu_init() would then just check the auto_sel state
from register and sync policy limits from min/max_perf registers when
autonomous mode is active.
Please let me know your thoughts.

[1] https://lore.kernel.org/lkml/5593d364-ca37-41c5-b33f-f7e245d6d626@nvidia.com/

>
>> +	if (auto_sel_mode && policy->last_governor[0] == '\0' &&
>> +	    !cpu_data->perf_ctrls.auto_sel) {
>> +		/* Enable CPPC - optional register, some platforms need
>> it */
> The documentation of the CPPC Enable Register is subject to
> interpretation, but IIUC the field should be set to use the CPPC
> controls, so I assume this should be set in cppc_cpufreq_init()
> instead ?

Agree that the CPPC Enable is about using the CPPC control path
in general and not only for autonomous selection.
Will move cppc_set_enable() into cppc_cpufreq_init() or outside the
autonomous mode block in cppc_cpufreq_cpu_init() as per conclusion
of previous comment.

>> +		ret = cppc_set_enable(cpu, true);
>> +		if (ret && ret != -EOPNOTSUPP)
>> +			pr_warn("Failed to enable CPPC for CPU%d
>> (%d)\n", cpu, ret);
>> +
>> +		/*
>> +		 * Prefer HW min/max_perf when set; otherwise program from
>> +		 * policy limits derived earlier from caps.
>> +		 * Clamp desired_perf to bounds and sync policy->cur.
>> +		 */
>> +		if (!cpu_data->perf_ctrls.min_perf ||
>> !cpu_data->perf_ctrls.max_perf)
>
> The function doesn't seem to exist.

It is newly added in [2].
Don't need to call it if we move the setup to cppc_cpufreq_init().

[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=ea3db45ae476889a1ba0ab3617e6afdeeefbda3d

>
>> +			cppc_cpufreq_update_perf_limits(cpu_data, policy);
>> +
>> +		cpu_data->perf_ctrls.desired_perf =
>> +			clamp_t(u32, cpu_data->perf_ctrls.desired_perf,
>> +				cpu_data->perf_ctrls.min_perf,
>> +				cpu_data->perf_ctrls.max_perf);
>> +
>> +		policy->cur = cppc_perf_to_khz(caps,
>> +					       cpu_data->perf_ctrls.desired_perf);
>> +
>
> Maybe this should also be done in cppc_cpufreq_init()
> if the auto_sel_mode parameter is set ?

Yes.

>
>> +		/* EPP is optional - some platforms may not support it */
>> +		ret = cppc_set_epp(cpu, CPPC_EPP_PERFORMANCE_PREF);
>> +		if (ret && ret != -EOPNOTSUPP)
>> +			pr_warn("Failed to set EPP for CPU%d (%d)\n",
>> cpu, ret);
>> +		else if (!ret)
>> +			cpu_data->perf_ctrls.energy_perf =
>> CPPC_EPP_PERFORMANCE_PREF;
>> +
>> +		ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>> +		if (ret) {
>> +			pr_debug("Err setting perf for autonomous mode
>> CPU:%d ret:%d\n",
>> +				 cpu, ret);
>> +			goto out;
>> +		}
>> +
>> +		ret = cppc_set_auto_sel(cpu, true);
>> +		if (ret && ret != -EOPNOTSUPP) {
>> +			pr_warn("Failed autonomous config for CPU%d
>> (%d)\n",
>> +				cpu, ret);
>> +			goto out;
>> +		}
>> +		if (!ret)
>> +			cpu_data->perf_ctrls.auto_sel = true;
>> +	}
>> +
>> +	if (cpu_data->perf_ctrls.auto_sel) {
>
> There is a patchset ongoing which tries to remove
> setting policy->min/max from driver initialization.
> Indeed, these values are only temporarily valid,
> until the governor override them.
> It is not sure yet the patch will be accepted though.
>
> https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/
>

You are right that policy->min/max from .init() are temporary today
as cpufreq_set_policy() overwrites them before the governor starts.

On my test platform (highest == nominal, lowest_nonlinear == lowest),
this had no visible effect because the BIOS bounds and cpuinfo range
end up identical. But on platforms where they differ, the governor
would widen the range to full cpuinfo limits.

I think your patch [3] fixes this by giving these the right semantic as
initial QoS requests. With it, cpufreq_set_policy() preserves the policy
limits set from min/max_perf registers in .init(), which can either be
BIOS values on first boot or last user configured values before hotplug.

I will update the comment in v2 to reflect QoS seeding intent.

I see that the first two patches of your series [3] is applied for 7.1.
Do you plan to send the pending patch (3/4) from [3]?

[3] https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/

>
>
>> +		/* Sync policy limits from HW when autonomous mode is
>> active */
>> +		policy->min = cppc_perf_to_khz(caps,
>> +					       cpu_data->perf_ctrls.min_perf ?:
>> +					       caps->lowest_nonlinear_perf);
>> +		policy->max = cppc_perf_to_khz(caps,
>> +					       cpu_data->perf_ctrls.max_perf ?:
>> +					       caps->nominal_perf);
>> +	} else {
>> +		/* Normal mode: governors control frequency */
>> +		ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>> +		if (ret) {
>> +			pr_debug("Err setting perf value:%d on CPU:%d.
>> ret:%d\n",
>> +				 caps->highest_perf, cpu, ret);
>> +			goto out;
>> +		}
>>  	}
>>
>>  	cppc_cpufreq_cpu_fie_init(policy);
>> @@ -1038,10 +1104,18 @@ static int __init cppc_cpufreq_init(void)
>>
>>  static void __exit cppc_cpufreq_exit(void)
>>  {
>> +	unsigned int cpu;
>> +
>> +	for_each_present_cpu(cpu)
>> +		cppc_set_auto_sel(cpu, false);
>
> If the firmware has a default EPP value, it means that loading
> and the unloading the driver will reset this default EPP value.
> Maybe the initial EPP value and/or the auto_sel value should be
> cached somewhere and restored on exit ?
> I don't know if this is actually an issue, this is just to signal it.

The auto_sel_mode boot path programs EPP to performance preference(0),
not the firmware’s previous value. On unload we only call
cppc_set_auto_sel(false); we do not restore EPP, min/max perf, or other
CPPC fields to firmware defaults.

Thank you,
Sumit Gupta

....

^ permalink raw reply	[flat|nested] 10+ messages in thread
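The "warn for a CPU failing to enable and continue so other CPUs keep autonomous mode" approach proposed in the reply above can be sketched in user space as follows. The names `enable_fn`, `enable_autonomous_all()` and `demo_enable()` are hypothetical and exist only for this illustration; the error-handling convention (success is 0, -EOPNOTSUPP means the register is absent and is not worth a warning) is taken from the patch.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical per-CPU enable hook: returns 0 or a negative errno. */
typedef int (*enable_fn)(unsigned int cpu);

/*
 * Try to enable autonomous selection on every CPU. A failure on one CPU
 * is reported and skipped so that the remaining CPUs are still configured.
 * Returns the number of CPUs that were enabled.
 */
static unsigned int enable_autonomous_all(unsigned int nr_cpus, enable_fn enable)
{
	unsigned int cpu, enabled = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		int ret = enable(cpu);

		if (!ret) {
			enabled++;
			continue;
		}

		/* Missing optional register: stay quiet, leave the CPU as is. */
		if (ret != -EOPNOTSUPP)
			fprintf(stderr, "Failed autonomous config for CPU%u (%d)\n",
				cpu, ret);
	}

	return enabled;
}

/* Illustrative backend: CPU 2 fails, CPU 3 lacks the register. */
static int demo_enable(unsigned int cpu)
{
	if (cpu == 2)
		return -EIO;
	if (cpu == 3)
		return -EOPNOTSUPP;
	return 0;
}
```

Calling `enable_autonomous_all(4, demo_enable)` enables CPUs 0 and 1, prints one warning for CPU 2, and silently skips CPU 3. A per-policy init callback would then only need to read back the resulting state, as the reply suggests.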
* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
  2026-04-06 18:08   ` Sumit Gupta
@ 2026-04-10 13:47     ` Pierre Gondois
  2026-04-13  5:51       ` Viresh Kumar
  2026-04-20 13:07       ` Sumit Gupta
  0 siblings, 2 replies; 10+ messages in thread
From: Pierre Gondois @ 2026-04-10 13:47 UTC (permalink / raw)
  To: Sumit Gupta
  Cc: linux-tegra, linux-kernel, linux-doc, zhenglifeng1, treding,
	viresh.kumar, jonathanh, vsethi, ionela.voinescu, ksitaraman,
	sanjayc, zhanjie9, corbet, mochs, skhan, bbasu, rdunlap,
	linux-pm, mario.limonciello, rafael

Hello Sumit,

On 4/6/26 20:08, Sumit Gupta wrote:
> Hi Pierre,
>
> Thank you for the comments.
> Sorry for late reply as I was on vacation.
>
No worries
>
> On 24/03/26 23:48, Pierre Gondois wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> Hello Sumit,
>>
>> On 3/17/26 16:10, Sumit Gupta wrote:
>>> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC
>>> autonomous performance selection on all CPUs at system startup without
>>> requiring runtime sysfs manipulation. When autonomous mode is enabled,
>>> the hardware automatically adjusts CPU performance based on workload
>>> demands using Energy Performance Preference (EPP) hints.
>>>
>>> When auto_sel_mode=1:
>>> - Configure all CPUs for autonomous operation on first init
>>> - Set EPP to performance preference (0x0)
>>> - Use HW min/max when set; otherwise program from policy limits (caps)
>>> - Clamp desired_perf to bounds before enabling autonomous mode
>>> - Hardware controls frequency instead of the OS governor
>>>
>>> The boot parameter is applied only during first policy initialization.
>>> On hotplug, skip applying it so that the user's runtime sysfs
>>> configuration is preserved.
>>>
>>> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation)
>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>>> ---
>>> Part 1 [1] of this series was applied for 7.1 and present in next.
>>> Sending this patch as reworked version of 'patch 11' from [2] based
>>> on next.
>>>
>>> [1]
>>> https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/
>>> [2]
>>> https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/
>>> ---
>>>  .../admin-guide/kernel-parameters.txt         | 13 +++
>>>  drivers/cpufreq/cppc_cpufreq.c                | 84
>>> +++++++++++++++++--
>>>  2 files changed, 92 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
>>> b/Documentation/admin-guide/kernel-parameters.txt
>>> index fa6171b5fdd5..de4b4c89edfe 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -1060,6 +1060,19 @@ Kernel parameters
>>>  			policy to use. This governor must be
>>> registered in the
>>>  			kernel before the cpufreq driver probes.
>>>
>>> +	cppc_cpufreq.auto_sel_mode=
>>> +			[CPU_FREQ] Enable ACPI CPPC autonomous
>>> performance
>>> +			selection. When enabled, hardware
>>> automatically adjusts
>>> +			CPU frequency on all CPUs based on workload
>>> demands.
>>> +			In Autonomous mode, Energy Performance
>>> Preference (EPP)
>>> +			hints guide hardware toward performance (0x0)
>>> or energy
>>> +			efficiency (0xff).
>>> +			Requires ACPI CPPC autonomous selection
>>> register support.
>>> +			Format: <bool>
>>> +			Default: 0 (disabled)
>>> +			0: use cpufreq governors
>>> +			1: enable if supported by hardware
>>> +
>>>  	cpu_init_udelay=N
>>>  			[X86,EARLY] Delay for N microsec between
>>> assert and de-assert
>>>  			of APIC INIT to start processors. This delay
>>> occurs
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c
>>> b/drivers/cpufreq/cppc_cpufreq.c
>>> index 5dfb109cf1f4..49c148b2a0a4 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -28,6 +28,9 @@
>>>
>>>  static struct cpufreq_driver cppc_cpufreq_driver;
>>>
>>> +/* Autonomous Selection boot parameter */
>>> +static bool auto_sel_mode;
>>> +
>>>  #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>>>  static enum {
>>>  	FIE_UNSET = -1,
>>> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct
>>> cpufreq_policy *policy)
>>>  	policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
>>>  	cpu_data->perf_ctrls.desired_perf = caps->highest_perf;
>>>
>>> -	ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>> -	if (ret) {
>>> -		pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
>>> -			 caps->highest_perf, cpu, ret);
>>> -		goto out;
>>> +	/*
>>> +	 * Enable autonomous mode on first init if boot param is set.
>>> +	 * Check last_governor to detect first init and skip if auto_sel
>>> +	 * is already enabled.
>>> +	 */
>> If the goal is to set autosel only once at the driver init,
>> shouldn't this be done in cppc_cpufreq_init() ?
>> I understand that cpu_data doesn't exist yet in
>> cppc_cpufreq_init(), but this seems more appropriate to do
>> it there IMO.
>>
>> This means the cpudata should be updated accordingly
>> in this cppc_cpufreq_cpu_init() function.
>
> In an earlier version [1], the setup was in cppc_cpufreq_init() but
> was moved to cppc_cpufreq_cpu_init() to improve per-CPU error handling.
> Keeping the setup in cppc_cpufreq_init() helps to avoid the last_governor
> check. We can warn for a CPU failing to enable and continue so other
> CPUs keep autonomous mode.
> cppc_cpufreq_cpu_init() would then just check the auto_sel state
> from register and sync policy limits from min/max_perf registers when
> autonomous mode is active.
> Please let me know your thoughts.

FWIU the auto_sel_mode module parameter allows to configure the
default auto_sel_mode when the driver is first loaded, so there
should not need to check that again whenever
cppc_cpufreq_cpu_init() is called.
Maybe Ionela saw something we didn't see ?

Also just to be sure, should it still be possible to change the
auto_sel_mode through the sysfs if the driver was loaded with
auto_sel_mode=1 ?

>
> [1]
> https://lore.kernel.org/lkml/5593d364-ca37-41c5-b33f-f7e245d6d626@nvidia.com/
>
>
>>
>>> +	if (auto_sel_mode && policy->last_governor[0] == '\0' &&
>>> +	    !cpu_data->perf_ctrls.auto_sel) {
>>> +		/* Enable CPPC - optional register, some platforms
>>> need it */
>> The documentation of the CPPC Enable Register is subject to
>> interpretation, but IIUC the field should be set to use the CPPC
>> controls, so I assume this should be set in cppc_cpufreq_init()
>> instead ?
>
> Agree that the CPPC Enable is about using the CPPC control path
> in general and not only for autonomous selection.
> Will move cppc_set_enable() into cppc_cpufreq_init() or outside the
> autonomous mode block in cppc_cpufreq_cpu_init() as per conclusion
> of previous comment.
>
>>> +		ret = cppc_set_enable(cpu, true);
>>> +		if (ret && ret != -EOPNOTSUPP)
>>> +			pr_warn("Failed to enable CPPC for CPU%d
>>> (%d)\n", cpu, ret);
>>> +
>>> +		/*
>>> +		 * Prefer HW min/max_perf when set; otherwise program
>>> from
>>> +		 * policy limits derived earlier from caps.
>>> +		 * Clamp desired_perf to bounds and sync policy->cur.
>>> +		 */
>>> +		if (!cpu_data->perf_ctrls.min_perf ||
>>> !cpu_data->perf_ctrls.max_perf)
>>
>> The function doesn't seem to exist.
>
> It is newly added in [2].
> Don't need to call it if we move the setup to cppc_cpufreq_init().

Ah ok right thanks.

>
> [2]
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=ea3db45ae476889a1ba0ab3617e6afdeeefbda3d
>
>
>
>>
>>> +			cppc_cpufreq_update_perf_limits(cpu_data, policy);
>>> +
>>> +		cpu_data->perf_ctrls.desired_perf =
>>> +			clamp_t(u32, cpu_data->perf_ctrls.desired_perf,
>>> +				cpu_data->perf_ctrls.min_perf,
>>> +				cpu_data->perf_ctrls.max_perf);
>>> +
>>> +		policy->cur = cppc_perf_to_khz(caps,
>>> +					       cpu_data->perf_ctrls.desired_perf);
>>> +
>>
>> Maybe this should also be done in cppc_cpufreq_init()
>> if the auto_sel_mode parameter is set ?
>
> Yes.
>
>>
>>> +		/* EPP is optional - some platforms may not support it */
>>> +		ret = cppc_set_epp(cpu, CPPC_EPP_PERFORMANCE_PREF);
>>> +		if (ret && ret != -EOPNOTSUPP)
>>> +			pr_warn("Failed to set EPP for CPU%d (%d)\n",
>>> cpu, ret);
>>> +		else if (!ret)
>>> +			cpu_data->perf_ctrls.energy_perf =
>>> CPPC_EPP_PERFORMANCE_PREF;
>>> +
>>> +		ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>> +		if (ret) {
>>> +			pr_debug("Err setting perf for autonomous mode
>>> CPU:%d ret:%d\n",
>>> +				 cpu, ret);
>>> +			goto out;
>>> +		}
>>> +
>>> +		ret = cppc_set_auto_sel(cpu, true);
>>> +		if (ret && ret != -EOPNOTSUPP) {
>>> +			pr_warn("Failed autonomous config for CPU%d
>>> (%d)\n",
>>> +				cpu, ret);
>>> +			goto out;
>>> +		}
>>> +		if (!ret)
>>> +			cpu_data->perf_ctrls.auto_sel = true;
>>> +	}
>>> +
>>> +	if (cpu_data->perf_ctrls.auto_sel) {
>>
>> There is a patchset ongoing which tries to remove
>> setting policy->min/max from driver initialization.
>> Indeed, these values are only temporarily valid,
>> until the governor override them.
>> It is not sure yet the patch will be accepted though.
>>
>> https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/
>>
>
>
> You are right that policy->min/max from .init() are temporary today
> as cpufreq_set_policy() overwrites them before the governor starts.
>
> On my test platform (highest == nominal, lowest_nonlinear == lowest),
> this had no visible effect because the BIOS bounds and cpuinfo range
> end up identical. But on platforms where they differ, the governor
> would widen the range to full cpuinfo limits.
>
> I think your patch [3] fixes this by giving these the right semantic as
> initial QoS requests. With it, cpufreq_set_policy() preserves the policy
> limits set from min/max_perf registers in .init(), which can either be
> BIOS values on first boot or last user configured values before hotplug.
>
> I will update the comment in v2 to reflect QoS seeding intent.
>
> I see that the first two patches of your series [3] is applied for 7.1.
> Do you plan to send the pending patch (3/4) from [3]?
>
I need to ping Viresh to check if this is still relevant.

> [3]
> https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/
>
>
>>
>>
>>> +		/* Sync policy limits from HW when autonomous mode is
>>> active */
>>> +		policy->min = cppc_perf_to_khz(caps,
>>> +					       cpu_data->perf_ctrls.min_perf ?:
>>> +					       caps->lowest_nonlinear_perf);
>>> +		policy->max = cppc_perf_to_khz(caps,
>>> +					       cpu_data->perf_ctrls.max_perf ?:
>>> +					       caps->nominal_perf);
>>> +	} else {
>>> +		/* Normal mode: governors control frequency */
>>> +		ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>> +		if (ret) {
>>> +			pr_debug("Err setting perf value:%d on CPU:%d.
>>> ret:%d\n",
>>> +				 caps->highest_perf, cpu, ret);
>>> +			goto out;
>>> +		}
>>>  	}
>>>
>>>  	cppc_cpufreq_cpu_fie_init(policy);
>>> @@ -1038,10 +1104,18 @@ static int __init cppc_cpufreq_init(void)
>>>
>>>  static void __exit cppc_cpufreq_exit(void)
>>>  {
>>> +	unsigned int cpu;
>>> +
>>> +	for_each_present_cpu(cpu)
>>> +		cppc_set_auto_sel(cpu, false);
>>
>> If the firmware has a default EPP value, it means that loading
>> and the unloading the driver will reset this default EPP value.
>> Maybe the initial EPP value and/or the auto_sel value should be
>> cached somewhere and restored on exit ?
>> I don't know if this is actually an issue, this is just to signal it.
>
> The auto_sel_mode boot path programs EPP to performance preference(0),
> not the firmware’s previous value. On unload we only call
> cppc_set_auto_sel(false); we do not restore EPP, min/max perf,
> or other CPPC fields to firmware defaults.

Yes right, so loading/unloading the driver might change the
default EPP value.

>
> Thank you,
> Sumit Gupta
>
> ....
>

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
  2026-04-10 13:47 ` Pierre Gondois
@ 2026-04-13 5:51 ` Viresh Kumar
  2026-04-20 13:13 ` Sumit Gupta
  2026-04-20 13:07 ` Sumit Gupta
  1 sibling, 1 reply; 10+ messages in thread
From: Viresh Kumar @ 2026-04-13 5:51 UTC (permalink / raw)
To: Pierre Gondois
Cc: Sumit Gupta, linux-tegra, linux-kernel, linux-doc, zhenglifeng1, treding, jonathanh, vsethi, ionela.voinescu, ksitaraman, sanjayc, zhanjie9, corbet, mochs, skhan, bbasu, rdunlap, linux-pm, mario.limonciello, rafael

On 10-04-26, 15:47, Pierre Gondois wrote:
> I need to ping Viresh to check if this is still relevant.

I think it's okay to clear the min/max state in the kernel once and for all if
you think it is not done nicely. As discussed earlier, try that in a fresh
series which only does that part.

--
viresh

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
  2026-04-13 5:51 ` Viresh Kumar
@ 2026-04-20 13:13 ` Sumit Gupta
  0 siblings, 0 replies; 10+ messages in thread
From: Sumit Gupta @ 2026-04-20 13:13 UTC (permalink / raw)
To: Viresh Kumar, Pierre Gondois
Cc: linux-tegra, linux-kernel, linux-doc, zhenglifeng1, treding, jonathanh, vsethi, ionela.voinescu, ksitaraman, sanjayc, zhanjie9, corbet, mochs, skhan, bbasu, rdunlap, linux-pm, mario.limonciello, rafael, sumitg

On 13/04/26 11:21, Viresh Kumar wrote:
> On 10-04-26, 15:47, Pierre Gondois wrote:
>> I need to ping Viresh to check if this is still relevant.
> I think it's okay to clear the min/max state in the kernel once and for all if
> you think it is not done nicely. As discussed earlier, try that in a fresh
> series which only does that part.
>
> --
> viresh

Thanks Pierre and Viresh.

In autonomous mode, the min/max_perf HW registers directly control the
frequency range the hardware operates in, so the values programmed in
.init() need to survive through the governor. I verified this on a
platform where lowest_nonlinear_perf != lowest_perf: there, the min_perf
register ends up at lowest_perf instead of the intended
lowest_nonlinear_perf.

Pierre's QoS seeding patch would fix this. Happy to test once it's sent.

Thank you,
Sumit Gupta

^ permalink raw reply [flat|nested] 10+ messages in thread
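For readers following the limit handling discussed in this message, here is a minimal user-space sketch of the fallback and clamp logic from the quoted patch. It is not the driver code: `struct perf_caps`, `struct perf_ctrls`, `perf_or_default()` and `seed_limits()` are hypothetical stand-ins, and the sketch folds the patch's separate steps (filling unset bounds, clamping desired_perf) into one helper. As in the patch, a register value of 0 is assumed to mean "not set".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's capability and control fields. */
struct perf_caps {
	uint32_t lowest_perf;
	uint32_t lowest_nonlinear_perf;
	uint32_t nominal_perf;
	uint32_t highest_perf;
};

struct perf_ctrls {
	uint32_t min_perf;	/* 0 is assumed to mean "not set" */
	uint32_t max_perf;	/* 0 is assumed to mean "not set" */
	uint32_t desired_perf;
};

/* Use the HW value when set, otherwise fall back to a capability value. */
uint32_t perf_or_default(uint32_t hw, uint32_t fallback)
{
	return hw ? hw : fallback;
}

/*
 * Fill in missing bounds from the capabilities, then clamp desired_perf
 * into [min_perf, max_perf] before autonomous mode would be enabled.
 */
void seed_limits(const struct perf_caps *caps, struct perf_ctrls *ctrls)
{
	ctrls->min_perf = perf_or_default(ctrls->min_perf,
					  caps->lowest_nonlinear_perf);
	ctrls->max_perf = perf_or_default(ctrls->max_perf,
					  caps->nominal_perf);

	/* The sketch assumes consistent bounds. */
	assert(ctrls->min_perf <= ctrls->max_perf);

	if (ctrls->desired_perf < ctrls->min_perf)
		ctrls->desired_perf = ctrls->min_perf;
	else if (ctrls->desired_perf > ctrls->max_perf)
		ctrls->desired_perf = ctrls->max_perf;
}
```

With unset bounds, the floor comes from lowest_nonlinear_perf rather than lowest_perf, which is the intended value named in the message above.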
* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support 2026-04-10 13:47 ` Pierre Gondois 2026-04-13 5:51 ` Viresh Kumar @ 2026-04-20 13:07 ` Sumit Gupta 2026-04-24 12:10 ` Sumit Gupta 1 sibling, 1 reply; 10+ messages in thread From: Sumit Gupta @ 2026-04-20 13:07 UTC (permalink / raw) To: Pierre Gondois Cc: linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, zhenglifeng1@huawei.com, Thierry Reding, viresh.kumar@linaro.org, Jon Hunter, Vikram Sethi, ionela.voinescu@arm.com, Krishna Sitaraman, Sanjay Chandrashekara, zhanjie9@hisilicon.com, corbet@lwn.net, Matt Ochs, skhan@linuxfoundation.org, Bibek Basu, rdunlap@infradead.org, linux-pm@vger.kernel.org, mario.limonciello@amd.com, rafael@kernel.org, sumitg >>> On 3/17/26 16:10, Sumit Gupta wrote: >>>> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC >>>> autonomous performance selection on all CPUs at system startup without >>>> requiring runtime sysfs manipulation. When autonomous mode is enabled, >>>> the hardware automatically adjusts CPU performance based on workload >>>> demands using Energy Performance Preference (EPP) hints. >>>> >>>> When auto_sel_mode=1: >>>> - Configure all CPUs for autonomous operation on first init >>>> - Set EPP to performance preference (0x0) >>>> - Use HW min/max when set; otherwise program from policy limits (caps) >>>> - Clamp desired_perf to bounds before enabling autonomous mode >>>> - Hardware controls frequency instead of the OS governor >>>> >>>> The boot parameter is applied only during first policy initialization. >>>> On hotplug, skip applying it so that the user's runtime sysfs >>>> configuration is preserved. >>>> >>>> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation) >>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com> >>>> --- >>>> Part 1 [1] of this series was applied for 7.1 and present in next. >>>> Sending this patch as reworked version of 'patch 11' from [2] based >>>> on next. 
>>>> >>>> [1] >>>> https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/ >>>> [2] >>>> https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/ >>>> --- >>>> .../admin-guide/kernel-parameters.txt | 13 +++ >>>> drivers/cpufreq/cppc_cpufreq.c | 84 >>>> +++++++++++++++++-- >>>> 2 files changed, 92 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt >>>> b/Documentation/admin-guide/kernel-parameters.txt >>>> index fa6171b5fdd5..de4b4c89edfe 100644 >>>> --- a/Documentation/admin-guide/kernel-parameters.txt >>>> +++ b/Documentation/admin-guide/kernel-parameters.txt >>>> @@ -1060,6 +1060,19 @@ Kernel parameters >>>> policy to use. This governor must be >>>> registered in the >>>> kernel before the cpufreq driver probes. >>>> >>>> + cppc_cpufreq.auto_sel_mode= >>>> + [CPU_FREQ] Enable ACPI CPPC autonomous >>>> performance >>>> + selection. When enabled, hardware >>>> automatically adjusts >>>> + CPU frequency on all CPUs based on workload >>>> demands. >>>> + In Autonomous mode, Energy Performance >>>> Preference (EPP) >>>> + hints guide hardware toward performance (0x0) >>>> or energy >>>> + efficiency (0xff). >>>> + Requires ACPI CPPC autonomous selection >>>> register support. >>>> + Format: <bool> >>>> + Default: 0 (disabled) >>>> + 0: use cpufreq governors >>>> + 1: enable if supported by hardware >>>> + >>>> cpu_init_udelay=N >>>> [X86,EARLY] Delay for N microsec between >>>> assert and de-assert >>>> of APIC INIT to start processors. 
This delay >>>> occurs >>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c >>>> b/drivers/cpufreq/cppc_cpufreq.c >>>> index 5dfb109cf1f4..49c148b2a0a4 100644 >>>> --- a/drivers/cpufreq/cppc_cpufreq.c >>>> +++ b/drivers/cpufreq/cppc_cpufreq.c >>>> @@ -28,6 +28,9 @@ >>>> >>>> static struct cpufreq_driver cppc_cpufreq_driver; >>>> >>>> +/* Autonomous Selection boot parameter */ >>>> +static bool auto_sel_mode; >>>> + >>>> #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE >>>> static enum { >>>> FIE_UNSET = -1, >>>> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct >>>> cpufreq_policy *policy) >>>> policy->cur = cppc_perf_to_khz(caps, caps->highest_perf); >>>> cpu_data->perf_ctrls.desired_perf = caps->highest_perf; >>>> >>>> - ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls); >>>> - if (ret) { >>>> - pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n", >>>> - caps->highest_perf, cpu, ret); >>>> - goto out; >>>> + /* >>>> + * Enable autonomous mode on first init if boot param is set. >>>> + * Check last_governor to detect first init and skip if auto_sel >>>> + * is already enabled. >>>> + */ >>> If the goal is to set autosel only once at the driver init, >>> shouldn't this be done in cppc_cpufreq_init() ? >>> I understand that cpu_data doesn't exist yet in >>> cppc_cpufreq_init(), but this seems more appropriate to do >>> it there IMO. >>> >>> This means the cpudata should be updated accordingly >>> in this cppc_cpufreq_cpu_init() function. >> In an earlier version [1], the setup was in cppc_cpufreq_init() but >> was moved to cppc_cpufreq_cpu_init() to improve per-CPU error handling. >> Keeping the setup in cppc_cpufreq_init() helps to avoid the last_governor >> check. We can warn for a CPU failing to enable and continue so other >> CPUs keep autonomous mode. >> cppc_cpufreq_cpu_init() would then just check the auto_sel state >> from register and sync policy limits from min/max_perf registers when >> autonomous mode is active. >> Please let me know your thoughts. 
> FWIU the auto_sel_mode module parameter allows to > configure the default auto_sel_mode when the driver is > first loaded, so there should not need to check that again > whenever cppc_cpufreq_cpu_init() is called. > Maybe Ionela saw something we didn't see ? AFAIU, the concern in that review [1] was about error handling as the earlier version disabled auto_sel on all CPUs if any single CPU failed. Per-CPU error handling in cppc_cpufreq_init() (warn and continue) addresses that. Can't think of more reason. Do you have anything in mind? > > Also just to be sure, should it still be possible to change > the auto_sel_mode through the sysfs if the driver was > loaded with auto_sel_mode=1 ? > Yes, the per-CPU auto_select sysfs attribute works independently of the boot param. Users can enable or disable auto_sel on any CPU at runtime via sysfs, regardless of how the driver was loaded. The boot param only sets the initial state. >> [1] >> https://lore.kernel.org/lkml/5593d364-ca37-41c5-b33f-f7e245d6d626@nvidia.com/ >> >> >>>> + if (auto_sel_mode && policy->last_governor[0] == '\0' && >>>> + !cpu_data->perf_ctrls.auto_sel) { >>>> + /* Enable CPPC - optional register, some platforms >>>> need it */ >>> The documentation of the CPPC Enable Register is subject to >>> interpretation, but IIUC the field should be set to use the CPPC >>> controls, so I assume this should be set in cppc_cpufreq_init() >>> instead ? >> Agree that the CPPC Enable is about using the CPPC control path >> in general and not only for autonomous selection. >> Will move cppc_set_enable() into cppc_cpufreq_init() or outside the >> autonomous mode block in cppc_cpufreq_cpu_init() as per conclusion >> of previous comment. 
>> >>>> + ret = cppc_set_enable(cpu, true); >>>> + if (ret && ret != -EOPNOTSUPP) >>>> + pr_warn("Failed to enable CPPC for CPU%d >>>> (%d)\n", cpu, ret); >>>> + >>>> + /* >>>> + * Prefer HW min/max_perf when set; otherwise program >>>> from >>>> + * policy limits derived earlier from caps. >>>> + * Clamp desired_perf to bounds and sync policy->cur. >>>> + */ >>>> + if (!cpu_data->perf_ctrls.min_perf || >>>> !cpu_data->perf_ctrls.max_perf) >>> The function doesn't seem to exist. >> It is newly added in [2]. >> Don't need to call it if we move the setup to cppc_cpufreq_init(). > Ah ok right thanks. > > >> [2] >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=ea3db45ae476889a1ba0ab3617e6afdeeefbda3d >> >> >> >>>> + cppc_cpufreq_update_perf_limits(cpu_data, policy); >>>> + >>>> + cpu_data->perf_ctrls.desired_perf = >>>> + clamp_t(u32, cpu_data->perf_ctrls.desired_perf, >>>> + cpu_data->perf_ctrls.min_perf, >>>> + cpu_data->perf_ctrls.max_perf); >>>> + >>>> + policy->cur = cppc_perf_to_khz(caps, >>>> + cpu_data->perf_ctrls.desired_perf); >>>> + >>> Maybe this should also be done in cppc_cpufreq_init() >>> if the auto_sel_mode parameter is set ? >> Yes. 
>> >>>> + /* EPP is optional - some platforms may not support it */ >>>> + ret = cppc_set_epp(cpu, CPPC_EPP_PERFORMANCE_PREF); >>>> + if (ret && ret != -EOPNOTSUPP) >>>> + pr_warn("Failed to set EPP for CPU%d (%d)\n", >>>> cpu, ret); >>>> + else if (!ret) >>>> + cpu_data->perf_ctrls.energy_perf = >>>> CPPC_EPP_PERFORMANCE_PREF; >>>> + >>>> + ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls); >>>> + if (ret) { >>>> + pr_debug("Err setting perf for autonomous mode >>>> CPU:%d ret:%d\n", >>>> + cpu, ret); >>>> + goto out; >>>> + } >>>> + >>>> + ret = cppc_set_auto_sel(cpu, true); >>>> + if (ret && ret != -EOPNOTSUPP) { >>>> + pr_warn("Failed autonomous config for CPU%d >>>> (%d)\n", >>>> + cpu, ret); >>>> + goto out; >>>> + } >>>> + if (!ret) >>>> + cpu_data->perf_ctrls.auto_sel = true; >>>> + } >>>> + >>>> + if (cpu_data->perf_ctrls.auto_sel) { >>> There is a patchset ongoing which tries to remove >>> setting policy->min/max from driver initialization. >>> Indeed, these values are only temporarily valid, >>> until the governor override them. >>> It is not sure yet the patch will be accepted though. >>> >>> https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/ >>> >> >> You are right that policy->min/max from .init() are temporary today >> as cpufreq_set_policy() overwrites them before the governor starts. >> >> On my test platform (highest == nominal, lowest_nonlinear == lowest), >> this had no visible effect because the BIOS bounds and cpuinfo range >> end up identical. But on platforms where they differ, the governor >> would widen the range to full cpuinfo limits. >> >> I think your patch [3] fixes this by giving these the right semantic as >> initial QoS requests. With it, cpufreq_set_policy() preserves the policy >> limits set from min/max_perf registers in .init(), which can either be >> BIOS values on first boot or last user configured values before hotplug. >> >> I will update the comment in v2 to reflect QoS seeding intent. 
>> >> I see that the first two patches of your series [3] is applied for 7.1. >> Do you plan to send the pending patch (3/4) from [3]? >> > I need to ping Viresh to check if this is still relevant. > > >> [3] >> https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/ >> >> >>> >>>> + /* Sync policy limits from HW when autonomous mode is >>>> active */ >>>> + policy->min = cppc_perf_to_khz(caps, >>>> + cpu_data->perf_ctrls.min_perf ?: >>>> + caps->lowest_nonlinear_perf); >>>> + policy->max = cppc_perf_to_khz(caps, >>>> + cpu_data->perf_ctrls.max_perf ?: >>>> + caps->nominal_perf); >>>> + } else { >>>> + /* Normal mode: governors control frequency */ >>>> + ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls); >>>> + if (ret) { >>>> + pr_debug("Err setting perf value:%d on CPU:%d. >>>> ret:%d\n", >>>> + caps->highest_perf, cpu, ret); >>>> + goto out; >>>> + } >>>> } >>>> >>>> cppc_cpufreq_cpu_fie_init(policy); >>>> @@ -1038,10 +1104,18 @@ static int __init cppc_cpufreq_init(void) >>>> >>>> static void __exit cppc_cpufreq_exit(void) >>>> { >>>> + unsigned int cpu; >>>> + >>>> + for_each_present_cpu(cpu) >>>> + cppc_set_auto_sel(cpu, false); >>> If the firmware has a default EPP value, it means that loading >>> and the unloading the driver will reset this default EPP value. >>> Maybe the initial EPP value and/or the auto_sel value should be >>> cached somewhere and restored on exit ? >>> I don't know if this is actually an issue, this is just to signal it. >> The auto_sel_mode boot path programs EPP to performance preference(0), >> not the firmware’s previous value. On unload we only call >> cppc_set_auto_sel(false); we do not restore EPP, min/max perf, >> or other CPPC fields to firmware defaults. > Yes right, so loading/unloading the driver might change the > default EPP value. Acknowledged. With auto_sel_mode, load/unload can leave EPP and other CPPC fields different from the firmware defaults at boot. 
We can add explicit cache-and-restore on exit in a follow-up if that is
desired.

Thank you,
Sumit Gupta

^ permalink raw reply [flat|nested] 10+ messages in thread
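The cache-and-restore idea mentioned at the end of this message could look roughly like the user-space sketch below. Everything in it is hypothetical and only models the idea: `struct cpu_regs`, the `hw[]` and `saved[]` arrays, `NR_CPUS_SKETCH`, `sketch_enable_auto_sel()` and `sketch_restore()` are illustration names, not kernel API, and a real follow-up would read and write the registers through the CPPC library instead of a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the sketch */

/* Hypothetical model of the per-CPU registers touched on load. */
struct cpu_regs {
	uint32_t epp;
	bool auto_sel;
};

struct saved_state {
	struct cpu_regs regs;
	bool valid;
};

struct cpu_regs hw[NR_CPUS_SKETCH];		/* stands in for the HW registers */
struct saved_state saved[NR_CPUS_SKETCH];	/* firmware values seen at load */

/* Load path: remember the firmware values once, then program the defaults. */
void sketch_enable_auto_sel(unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);

	if (!saved[cpu].valid) {
		saved[cpu].regs = hw[cpu];
		saved[cpu].valid = true;
	}
	hw[cpu].epp = 0x0;	/* performance preference, as in the patch */
	hw[cpu].auto_sel = true;
}

/* Unload path: put back whatever firmware had configured, if cached. */
void sketch_restore(unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);

	if (!saved[cpu].valid)
		return;
	hw[cpu] = saved[cpu].regs;
	saved[cpu].valid = false;
}
```

Caching only on the first call means a later runtime change (for example through sysfs) does not overwrite the remembered firmware value, so unload returns to the boot-time state rather than to the last user setting. Whether that is the desired policy is a design question the thread leaves open.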
* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support 2026-04-20 13:07 ` Sumit Gupta @ 2026-04-24 12:10 ` Sumit Gupta 2026-04-24 12:55 ` Pierre Gondois 0 siblings, 1 reply; 10+ messages in thread From: Sumit Gupta @ 2026-04-24 12:10 UTC (permalink / raw) To: Pierre Gondois Cc: linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, zhenglifeng1@huawei.com, Thierry Reding, viresh.kumar@linaro.org, Jon Hunter, Vikram Sethi, ionela.voinescu@arm.com, Krishna Sitaraman, Sanjay Chandrashekara, zhanjie9@hisilicon.com, corbet@lwn.net, Matt Ochs, skhan@linuxfoundation.org, Bibek Basu, rdunlap@infradead.org, linux-pm@vger.kernel.org, mario.limonciello@amd.com, rafael@kernel.org, sumitg On 20/04/26 18:37, Sumit Gupta wrote: > >>>> On 3/17/26 16:10, Sumit Gupta wrote: >>>>> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC >>>>> autonomous performance selection on all CPUs at system startup >>>>> without >>>>> requiring runtime sysfs manipulation. When autonomous mode is >>>>> enabled, >>>>> the hardware automatically adjusts CPU performance based on workload >>>>> demands using Energy Performance Preference (EPP) hints. >>>>> >>>>> When auto_sel_mode=1: >>>>> - Configure all CPUs for autonomous operation on first init >>>>> - Set EPP to performance preference (0x0) >>>>> - Use HW min/max when set; otherwise program from policy limits >>>>> (caps) >>>>> - Clamp desired_perf to bounds before enabling autonomous mode >>>>> - Hardware controls frequency instead of the OS governor >>>>> >>>>> The boot parameter is applied only during first policy >>>>> initialization. >>>>> On hotplug, skip applying it so that the user's runtime sysfs >>>>> configuration is preserved. >>>>> >>>>> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation) >>>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com> >>>>> --- >>>>> Part 1 [1] of this series was applied for 7.1 and present in next. 
>>>>> Sending this patch as reworked version of 'patch 11' from [2] based >>>>> on next. >>>>> >>>>> [1] >>>>> https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/ >>>>> >>>>> [2] >>>>> https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/ >>>>> >>>>> --- >>>>> .../admin-guide/kernel-parameters.txt | 13 +++ >>>>> drivers/cpufreq/cppc_cpufreq.c | 84 >>>>> +++++++++++++++++-- >>>>> 2 files changed, 92 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt >>>>> b/Documentation/admin-guide/kernel-parameters.txt >>>>> index fa6171b5fdd5..de4b4c89edfe 100644 >>>>> --- a/Documentation/admin-guide/kernel-parameters.txt >>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt >>>>> @@ -1060,6 +1060,19 @@ Kernel parameters >>>>> policy to use. This governor must be >>>>> registered in the >>>>> kernel before the cpufreq driver probes. >>>>> >>>>> + cppc_cpufreq.auto_sel_mode= >>>>> + [CPU_FREQ] Enable ACPI CPPC autonomous >>>>> performance >>>>> + selection. When enabled, hardware >>>>> automatically adjusts >>>>> + CPU frequency on all CPUs based on workload >>>>> demands. >>>>> + In Autonomous mode, Energy Performance >>>>> Preference (EPP) >>>>> + hints guide hardware toward performance (0x0) >>>>> or energy >>>>> + efficiency (0xff). >>>>> + Requires ACPI CPPC autonomous selection >>>>> register support. >>>>> + Format: <bool> >>>>> + Default: 0 (disabled) >>>>> + 0: use cpufreq governors >>>>> + 1: enable if supported by hardware >>>>> + >>>>> cpu_init_udelay=N >>>>> [X86,EARLY] Delay for N microsec between >>>>> assert and de-assert >>>>> of APIC INIT to start processors. 
This delay >>>>> occurs >>>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c >>>>> b/drivers/cpufreq/cppc_cpufreq.c >>>>> index 5dfb109cf1f4..49c148b2a0a4 100644 >>>>> --- a/drivers/cpufreq/cppc_cpufreq.c >>>>> +++ b/drivers/cpufreq/cppc_cpufreq.c >>>>> @@ -28,6 +28,9 @@ >>>>> >>>>> static struct cpufreq_driver cppc_cpufreq_driver; >>>>> >>>>> +/* Autonomous Selection boot parameter */ >>>>> +static bool auto_sel_mode; >>>>> + >>>>> #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE >>>>> static enum { >>>>> FIE_UNSET = -1, >>>>> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct >>>>> cpufreq_policy *policy) >>>>> policy->cur = cppc_perf_to_khz(caps, caps->highest_perf); >>>>> cpu_data->perf_ctrls.desired_perf = caps->highest_perf; >>>>> >>>>> - ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls); >>>>> - if (ret) { >>>>> - pr_debug("Err setting perf value:%d on CPU:%d. >>>>> ret:%d\n", >>>>> - caps->highest_perf, cpu, ret); >>>>> - goto out; >>>>> + /* >>>>> + * Enable autonomous mode on first init if boot param is set. >>>>> + * Check last_governor to detect first init and skip if >>>>> auto_sel >>>>> + * is already enabled. >>>>> + */ >>>> If the goal is to set autosel only once at the driver init, >>>> shouldn't this be done in cppc_cpufreq_init() ? >>>> I understand that cpu_data doesn't exist yet in >>>> cppc_cpufreq_init(), but this seems more appropriate to do >>>> it there IMO. >>>> >>>> This means the cpudata should be updated accordingly >>>> in this cppc_cpufreq_cpu_init() function. >>> In an earlier version [1], the setup was in cppc_cpufreq_init() but >>> was moved to cppc_cpufreq_cpu_init() to improve per-CPU error handling. >>> Keeping the setup in cppc_cpufreq_init() helps to avoid the >>> last_governor >>> check. We can warn for a CPU failing to enable and continue so other >>> CPUs keep autonomous mode. 
>>> cppc_cpufreq_cpu_init() would then just check the auto_sel state >>> from register and sync policy limits from min/max_perf registers when >>> autonomous mode is active. >>> Please let me know your thoughts. >> FWIU the auto_sel_mode module parameter allows to >> configure the default auto_sel_mode when the driver is >> first loaded, so there should not need to check that again >> whenever cppc_cpufreq_cpu_init() is called. >> Maybe Ionela saw something we didn't see ? > > AFAIU, the concern in that review [1] was about error handling as the > earlier version disabled auto_sel on all CPUs if any single CPU failed. > Per-CPU error handling in cppc_cpufreq_init() (warn and continue) > addresses that. Can't think of more reason. > Do you have anything in mind? > Actually, one case where cppc_cpufreq_cpu_init() would be needed is when CPUs are offline at boot. So I will keep the setup in cppc_cpufreq_cpu_init() in v2 same as present in current version. Thank you, Sumit Gupta .... ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support 2026-04-24 12:10 ` Sumit Gupta @ 2026-04-24 12:55 ` Pierre Gondois 2026-04-24 13:52 ` Sumit Gupta 0 siblings, 1 reply; 10+ messages in thread From: Pierre Gondois @ 2026-04-24 12:55 UTC (permalink / raw) To: Sumit Gupta Cc: linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, zhenglifeng1@huawei.com, Thierry Reding, viresh.kumar@linaro.org, Jon Hunter, Vikram Sethi, ionela.voinescu@arm.com, Krishna Sitaraman, Sanjay Chandrashekara, zhanjie9@hisilicon.com, corbet@lwn.net, Matt Ochs, skhan@linuxfoundation.org, Bibek Basu, rdunlap@infradead.org, linux-pm@vger.kernel.org, mario.limonciello@amd.com, rafael@kernel.org On 4/24/26 14:10, Sumit Gupta wrote: > > On 20/04/26 18:37, Sumit Gupta wrote: >> >>>>> On 3/17/26 16:10, Sumit Gupta wrote: >>>>>> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable >>>>>> CPPC >>>>>> autonomous performance selection on all CPUs at system startup >>>>>> without >>>>>> requiring runtime sysfs manipulation. When autonomous mode is >>>>>> enabled, >>>>>> the hardware automatically adjusts CPU performance based on workload >>>>>> demands using Energy Performance Preference (EPP) hints. >>>>>> >>>>>> When auto_sel_mode=1: >>>>>> - Configure all CPUs for autonomous operation on first init >>>>>> - Set EPP to performance preference (0x0) >>>>>> - Use HW min/max when set; otherwise program from policy limits >>>>>> (caps) >>>>>> - Clamp desired_perf to bounds before enabling autonomous mode >>>>>> - Hardware controls frequency instead of the OS governor >>>>>> >>>>>> The boot parameter is applied only during first policy >>>>>> initialization. >>>>>> On hotplug, skip applying it so that the user's runtime sysfs >>>>>> configuration is preserved. 
>>>>>> >>>>>> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation) >>>>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com> >>>>>> --- >>>>>> Part 1 [1] of this series was applied for 7.1 and present in next. >>>>>> Sending this patch as reworked version of 'patch 11' from [2] based >>>>>> on next. >>>>>> >>>>>> [1] >>>>>> https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/ >>>>>> >>>>>> [2] >>>>>> https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/ >>>>>> >>>>>> --- >>>>>> .../admin-guide/kernel-parameters.txt | 13 +++ >>>>>> drivers/cpufreq/cppc_cpufreq.c | 84 >>>>>> +++++++++++++++++-- >>>>>> 2 files changed, 92 insertions(+), 5 deletions(-) >>>>>> >>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt >>>>>> b/Documentation/admin-guide/kernel-parameters.txt >>>>>> index fa6171b5fdd5..de4b4c89edfe 100644 >>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt >>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt >>>>>> @@ -1060,6 +1060,19 @@ Kernel parameters >>>>>> policy to use. This governor must be >>>>>> registered in the >>>>>> kernel before the cpufreq driver probes. >>>>>> >>>>>> + cppc_cpufreq.auto_sel_mode= >>>>>> + [CPU_FREQ] Enable ACPI CPPC autonomous >>>>>> performance >>>>>> + selection. When enabled, hardware >>>>>> automatically adjusts >>>>>> + CPU frequency on all CPUs based on workload >>>>>> demands. >>>>>> + In Autonomous mode, Energy Performance >>>>>> Preference (EPP) >>>>>> + hints guide hardware toward performance (0x0) >>>>>> or energy >>>>>> + efficiency (0xff). >>>>>> + Requires ACPI CPPC autonomous selection >>>>>> register support. >>>>>> + Format: <bool> >>>>>> + Default: 0 (disabled) >>>>>> + 0: use cpufreq governors >>>>>> + 1: enable if supported by hardware >>>>>> + >>>>>> cpu_init_udelay=N >>>>>> [X86,EARLY] Delay for N microsec between >>>>>> assert and de-assert >>>>>> of APIC INIT to start processors. 
This delay >>>>>> occurs >>>>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c >>>>>> b/drivers/cpufreq/cppc_cpufreq.c >>>>>> index 5dfb109cf1f4..49c148b2a0a4 100644 >>>>>> --- a/drivers/cpufreq/cppc_cpufreq.c >>>>>> +++ b/drivers/cpufreq/cppc_cpufreq.c >>>>>> @@ -28,6 +28,9 @@ >>>>>> >>>>>> static struct cpufreq_driver cppc_cpufreq_driver; >>>>>> >>>>>> +/* Autonomous Selection boot parameter */ >>>>>> +static bool auto_sel_mode; >>>>>> + >>>>>> #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE >>>>>> static enum { >>>>>> FIE_UNSET = -1, >>>>>> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct >>>>>> cpufreq_policy *policy) >>>>>> policy->cur = cppc_perf_to_khz(caps, caps->highest_perf); >>>>>> cpu_data->perf_ctrls.desired_perf = caps->highest_perf; >>>>>> >>>>>> - ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls); >>>>>> - if (ret) { >>>>>> - pr_debug("Err setting perf value:%d on CPU:%d. >>>>>> ret:%d\n", >>>>>> - caps->highest_perf, cpu, ret); >>>>>> - goto out; >>>>>> + /* >>>>>> + * Enable autonomous mode on first init if boot param is set. >>>>>> + * Check last_governor to detect first init and skip if >>>>>> auto_sel >>>>>> + * is already enabled. >>>>>> + */ >>>>> If the goal is to set autosel only once at the driver init, >>>>> shouldn't this be done in cppc_cpufreq_init() ? >>>>> I understand that cpu_data doesn't exist yet in >>>>> cppc_cpufreq_init(), but this seems more appropriate to do >>>>> it there IMO. >>>>> >>>>> This means the cpudata should be updated accordingly >>>>> in this cppc_cpufreq_cpu_init() function. >>>> In an earlier version [1], the setup was in cppc_cpufreq_init() but >>>> was moved to cppc_cpufreq_cpu_init() to improve per-CPU error >>>> handling. >>>> Keeping the setup in cppc_cpufreq_init() helps to avoid the >>>> last_governor >>>> check. We can warn for a CPU failing to enable and continue so other >>>> CPUs keep autonomous mode. 
>>>> cppc_cpufreq_cpu_init() would then just check the auto_sel state >>>> from register and sync policy limits from min/max_perf registers when >>>> autonomous mode is active. >>>> Please let me know your thoughts. >>> FWIU the auto_sel_mode module parameter allows to >>> configure the default auto_sel_mode when the driver is >>> first loaded, so there should not need to check that again >>> whenever cppc_cpufreq_cpu_init() is called. >>> Maybe Ionela saw something we didn't see ? >> >> AFAIU, the concern in that review [1] was about error handling as the >> earlier version disabled auto_sel on all CPUs if any single CPU failed. >> Per-CPU error handling in cppc_cpufreq_init() (warn and continue) >> addresses that. Can't think of more reason. >> Do you have anything in mind? >> > > Actually, one case where cppc_cpufreq_cpu_init() would be needed > is when CPUs are offline at boot. So I will keep the setup in > cppc_cpufreq_cpu_init() in v2 same as present in current version. > Wouldn't it be possible to loop over the "cpu_present_mask" as you currently do in cppc_cpufreq_exit() ? ------ Another issue about relying on "cpu_data->perf_ctrls.auto_sel" in: """ if (auto_sel_mode && policy->last_governor[0] == '\0' && !cpu_data->perf_ctrls.auto_sel) { """ is that the cpu_data struct is fresh memory coming from cppc_cpufreq_get_cpu_data(), so it might always be 0 I think. ^ permalink raw reply [flat|nested] 10+ messages in thread
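Pierre's suggestion of looping over the present CPUs, combined with the "warn and continue" error handling discussed earlier in the thread, can be sketched in user space as follows. The names are hypothetical: `mock_set_auto_sel()` merely imitates a per-CPU register write that fails on one CPU, `NR_CPUS_SKETCH` is an arbitrary size, and the `present[]` array stands in for cpu_present_mask. The per-CPU `enabled[]` result is recorded from the outcome of each write rather than assumed from freshly allocated memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for the sketch */

/* Hypothetical mock: CPU 2 rejects the write with -5 (the value of -EIO). */
int mock_set_auto_sel(unsigned int cpu, bool enable)
{
	(void)enable;
	return cpu == 2 ? -5 : 0;
}

/*
 * Walk every present CPU; on failure, warn and continue so that one
 * failing CPU does not prevent the others from entering autonomous mode.
 * Returns the number of CPUs that were enabled.
 */
unsigned int enable_auto_sel_all(const bool *present, bool *enabled)
{
	unsigned int cpu, count = 0;

	assert(present && enabled);

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		int ret;

		enabled[cpu] = false;
		if (!present[cpu])
			continue;

		ret = mock_set_auto_sel(cpu, true);
		if (ret) {
			fprintf(stderr,
				"Failed autonomous config for CPU%u (%d)\n",
				cpu, ret);
			continue;
		}

		enabled[cpu] = true;
		count++;
	}

	return count;
}
```

In this model a present-but-offline CPU is still visited, which is the case raised in the quoted reply about CPUs that are offline at boot.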
* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support 2026-04-24 12:55 ` Pierre Gondois @ 2026-04-24 13:52 ` Sumit Gupta 0 siblings, 0 replies; 10+ messages in thread From: Sumit Gupta @ 2026-04-24 13:52 UTC (permalink / raw) To: Pierre Gondois Cc: linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, zhenglifeng1@huawei.com, Thierry Reding, viresh.kumar@linaro.org, Jon Hunter, Vikram Sethi, ionela.voinescu@arm.com, Krishna Sitaraman, Sanjay Chandrashekara, zhanjie9@hisilicon.com, corbet@lwn.net, Matt Ochs, skhan@linuxfoundation.org, Bibek Basu, rdunlap@infradead.org, linux-pm@vger.kernel.org, mario.limonciello@amd.com, rafael@kernel.org, sumitg On 24/04/26 18:25, Pierre Gondois wrote: > External email: Use caution opening links or attachments > > > On 4/24/26 14:10, Sumit Gupta wrote: >> >> On 20/04/26 18:37, Sumit Gupta wrote: >>> >>>>>> On 3/17/26 16:10, Sumit Gupta wrote: >>>>>>> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable >>>>>>> CPPC >>>>>>> autonomous performance selection on all CPUs at system startup >>>>>>> without >>>>>>> requiring runtime sysfs manipulation. When autonomous mode is >>>>>>> enabled, >>>>>>> the hardware automatically adjusts CPU performance based on >>>>>>> workload >>>>>>> demands using Energy Performance Preference (EPP) hints. >>>>>>> >>>>>>> When auto_sel_mode=1: >>>>>>> - Configure all CPUs for autonomous operation on first init >>>>>>> - Set EPP to performance preference (0x0) >>>>>>> - Use HW min/max when set; otherwise program from policy limits >>>>>>> (caps) >>>>>>> - Clamp desired_perf to bounds before enabling autonomous mode >>>>>>> - Hardware controls frequency instead of the OS governor >>>>>>> >>>>>>> The boot parameter is applied only during first policy >>>>>>> initialization. >>>>>>> On hotplug, skip applying it so that the user's runtime sysfs >>>>>>> configuration is preserved. 
>>>>>>> >>>>>>> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation) >>>>>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com> >>>>>>> --- >>>>>>> Part 1 [1] of this series was applied for 7.1 and present in next. >>>>>>> Sending this patch as reworked version of 'patch 11' from [2] based >>>>>>> on next. >>>>>>> >>>>>>> [1] >>>>>>> https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/ >>>>>>> >>>>>>> >>>>>>> [2] >>>>>>> https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/ >>>>>>> >>>>>>> >>>>>>> --- >>>>>>> .../admin-guide/kernel-parameters.txt | 13 +++ >>>>>>> drivers/cpufreq/cppc_cpufreq.c | 84 >>>>>>> +++++++++++++++++-- >>>>>>> 2 files changed, 92 insertions(+), 5 deletions(-) >>>>>>> >>>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt >>>>>>> b/Documentation/admin-guide/kernel-parameters.txt >>>>>>> index fa6171b5fdd5..de4b4c89edfe 100644 >>>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt >>>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt >>>>>>> @@ -1060,6 +1060,19 @@ Kernel parameters >>>>>>> policy to use. This governor must be >>>>>>> registered in the >>>>>>> kernel before the cpufreq driver probes. >>>>>>> >>>>>>> + cppc_cpufreq.auto_sel_mode= >>>>>>> + [CPU_FREQ] Enable ACPI CPPC autonomous >>>>>>> performance >>>>>>> + selection. When enabled, hardware >>>>>>> automatically adjusts >>>>>>> + CPU frequency on all CPUs based on workload >>>>>>> demands. >>>>>>> + In Autonomous mode, Energy Performance >>>>>>> Preference (EPP) >>>>>>> + hints guide hardware toward performance (0x0) >>>>>>> or energy >>>>>>> + efficiency (0xff). >>>>>>> + Requires ACPI CPPC autonomous selection >>>>>>> register support. 
>>>>>>> + Format: <bool> >>>>>>> + Default: 0 (disabled) >>>>>>> + 0: use cpufreq governors >>>>>>> + 1: enable if supported by hardware >>>>>>> + >>>>>>> cpu_init_udelay=N >>>>>>> [X86,EARLY] Delay for N microsec between >>>>>>> assert and de-assert >>>>>>> of APIC INIT to start processors. This delay >>>>>>> occurs >>>>>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c >>>>>>> b/drivers/cpufreq/cppc_cpufreq.c >>>>>>> index 5dfb109cf1f4..49c148b2a0a4 100644 >>>>>>> --- a/drivers/cpufreq/cppc_cpufreq.c >>>>>>> +++ b/drivers/cpufreq/cppc_cpufreq.c >>>>>>> @@ -28,6 +28,9 @@ >>>>>>> >>>>>>> static struct cpufreq_driver cppc_cpufreq_driver; >>>>>>> >>>>>>> +/* Autonomous Selection boot parameter */ >>>>>>> +static bool auto_sel_mode; >>>>>>> + >>>>>>> #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE >>>>>>> static enum { >>>>>>> FIE_UNSET = -1, >>>>>>> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct >>>>>>> cpufreq_policy *policy) >>>>>>> policy->cur = cppc_perf_to_khz(caps, caps->highest_perf); >>>>>>> cpu_data->perf_ctrls.desired_perf = caps->highest_perf; >>>>>>> >>>>>>> - ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls); >>>>>>> - if (ret) { >>>>>>> - pr_debug("Err setting perf value:%d on CPU:%d. >>>>>>> ret:%d\n", >>>>>>> - caps->highest_perf, cpu, ret); >>>>>>> - goto out; >>>>>>> + /* >>>>>>> + * Enable autonomous mode on first init if boot param is set. >>>>>>> + * Check last_governor to detect first init and skip if >>>>>>> auto_sel >>>>>>> + * is already enabled. >>>>>>> + */ >>>>>> If the goal is to set autosel only once at the driver init, >>>>>> shouldn't this be done in cppc_cpufreq_init() ? >>>>>> I understand that cpu_data doesn't exist yet in >>>>>> cppc_cpufreq_init(), but this seems more appropriate to do >>>>>> it there IMO. >>>>>> >>>>>> This means the cpudata should be updated accordingly >>>>>> in this cppc_cpufreq_cpu_init() function. 
>>>>> In an earlier version [1], the setup was in cppc_cpufreq_init() but
>>>>> was moved to cppc_cpufreq_cpu_init() to improve per-CPU error
>>>>> handling.
>>>>> Keeping the setup in cppc_cpufreq_init() helps to avoid the
>>>>> last_governor check. We can warn for a CPU failing to enable and
>>>>> continue so other CPUs keep autonomous mode.
>>>>> cppc_cpufreq_cpu_init() would then just check the auto_sel state
>>>>> from register and sync policy limits from min/max_perf registers when
>>>>> autonomous mode is active.
>>>>> Please let me know your thoughts.
>>>> FWIU the auto_sel_mode module parameter allows to
>>>> configure the default auto_sel_mode when the driver is
>>>> first loaded, so there should not need to check that again
>>>> whenever cppc_cpufreq_cpu_init() is called.
>>>> Maybe Ionela saw something we didn't see ?
>>>
>>> AFAIU, the concern in that review [1] was about error handling as the
>>> earlier version disabled auto_sel on all CPUs if any single CPU failed.
>>> Per-CPU error handling in cppc_cpufreq_init() (warn and continue)
>>> addresses that. Can't think of more reason.
>>> Do you have anything in mind?
>>>
>>
>> Actually, one case where cppc_cpufreq_cpu_init() would be needed
>> is when CPUs are offline at boot. So I will keep the setup in
>> cppc_cpufreq_cpu_init() in v2 same as present in current version.
>>
> Wouldn't it be possible to loop over the "cpu_present_mask"
> as you currently do in cppc_cpufreq_exit() ?

On ARM64 it works since registers go through PCC/SystemMemory which
don't require the target CPU online. But cppc_cpufreq.c is also built
for RISCV, where cpc_write_ffh() uses smp_call_function_single(). So
setup in cppc_cpufreq_init() with for_each_present_cpu() would fail on
RISCV+FFH platforms when CPUs are offline at boot.
cppc_cpufreq_cpu_init() handles all cases naturally.
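
The difference between the two placements can be sketched in plain
userspace C. This is only a model of the argument above, not driver
code: every model_* name and the reg_access enum are hypothetical, and
the -ENXIO return stands in for whatever error a cross-call to an
offline CPU would produce.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of how a CPPC register is reached. */
enum reg_access { ACCESS_PCC_OR_SYSMEM, ACCESS_FFH };

struct model_cpu {
	bool present;
	bool online;
	bool auto_sel;	/* modeled autonomous-selection enable bit */
};

/*
 * Modeled register write. An FFH write has to run on the target CPU,
 * so it fails while that CPU is offline. PCC/SystemMemory writes do
 * not depend on the target CPU being online.
 */
static int model_set_auto_sel(struct model_cpu *c, enum reg_access how)
{
	assert(c != NULL);
	if (how == ACCESS_FFH && !c->online)
		return -ENXIO;	/* illustrative error code */
	c->auto_sel = true;
	return 0;
}

/*
 * One-shot setup at driver init over all present CPUs, warn-and-continue
 * style. Returns the number of CPUs that could not be configured.
 */
static int model_driver_init_setup(struct model_cpu *cpus, size_t n,
				   enum reg_access how)
{
	int failed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!cpus[i].present)
			continue;
		if (model_set_auto_sel(&cpus[i], how))
			failed++;
	}
	return failed;
}

/*
 * Per-policy setup: in this model it runs once the CPU has been brought
 * online, so the write succeeds for either access method.
 */
static int model_cpu_init_setup(struct model_cpu *c, enum reg_access how)
{
	c->online = true;
	return model_set_auto_sel(c, how);
}
```

In this model a present-but-offline CPU is configured by the init-time
loop only when the access method is PCC/SystemMemory, while the
per-policy path covers it in both cases.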
>
> ------
>
> Another issue about relying on "cpu_data->perf_ctrls.auto_sel" in:
>
> """
> if (auto_sel_mode && policy->last_governor[0] == '\0' &&
>     !cpu_data->perf_ctrls.auto_sel) {
> """
>
> is that the cpu_data struct is fresh memory coming from
> cppc_cpufreq_get_cpu_data(), so it might always be 0
> I think.
>

cppc_cpufreq_get_cpu_data() calls cppc_get_perf() (added in [1]) which
reads perf_ctrls including auto_sel from the HW register. So,
cpu_data->perf_ctrls.auto_sel reflects the actual HW state, not the
zeroed kzalloc value.

[1] https://lore.kernel.org/lkml/20260206142658.72583-2-sumitg@nvidia.com/

Thank you,
Sumit Gupta
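
For readers following the first-init condition discussed above, a small
userspace model may help. It assumes, as stated in the reply, that the
per-CPU data is synced from hardware right after allocation. The
model_* types and helpers are hypothetical stand-ins, not the driver's
structures, and the hw_auto_sel argument plays the role of the hardware
register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-ins for the driver structures. */
struct model_perf_ctrls { bool auto_sel; };
struct model_cpu_data { struct model_perf_ctrls perf_ctrls; };
struct model_policy { char last_governor[16]; };

/*
 * Models the allocation path: zeroed memory first, then a sync from the
 * (modeled) hardware register, so auto_sel is not left at the calloc
 * default when hardware already has it enabled.
 */
static struct model_cpu_data *model_get_cpu_data(bool hw_auto_sel)
{
	struct model_cpu_data *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->perf_ctrls.auto_sel = hw_auto_sel;
	return d;
}

/*
 * The condition quoted in the thread: apply the boot parameter only when
 * it is set, no governor has been recorded yet (first init), and
 * autonomous selection is not already enabled in hardware.
 */
static bool model_should_apply_boot_param(bool auto_sel_mode,
					   const struct model_policy *policy,
					   const struct model_cpu_data *d)
{
	assert(policy != NULL && d != NULL);
	return auto_sel_mode &&
	       policy->last_governor[0] == '\0' &&
	       !d->perf_ctrls.auto_sel;
}
```

With these definitions, a first init with the parameter set and hardware
auto_sel off evaluates to true. A policy that already recorded a
governor, or hardware that already has auto_sel on, evaluates to false.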
end of thread, other threads:[~2026-04-24 13:52 UTC | newest]

Thread overview: 10+ messages
2026-03-17 15:10 [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support Sumit Gupta
2026-03-24 18:18 ` Pierre Gondois
2026-04-06 18:08 ` Sumit Gupta
2026-04-10 13:47 ` Pierre Gondois
2026-04-13 5:51 ` Viresh Kumar
2026-04-20 13:13 ` Sumit Gupta
2026-04-20 13:07 ` Sumit Gupta
2026-04-24 12:10 ` Sumit Gupta
2026-04-24 12:55 ` Pierre Gondois
2026-04-24 13:52 ` Sumit Gupta