* Re: [PATCH] cpufreq: CPPC: add autonomous mode boot parameter support
From: Pierre Gondois @ 2026-04-10 13:47 UTC
To: Sumit Gupta
Cc: linux-tegra, linux-kernel, linux-doc, zhenglifeng1, treding,
viresh.kumar, jonathanh, vsethi, ionela.voinescu, ksitaraman,
sanjayc, zhanjie9, corbet, mochs, skhan, bbasu, rdunlap, linux-pm,
mario.limonciello, rafael
In-Reply-To: <b8debb30-67a5-4d2b-8c08-8fd287f7258e@nvidia.com>
Hello Sumit,
On 4/6/26 20:08, Sumit Gupta wrote:
> Hi Pierre,
>
> Thank you for the comments.
> Sorry for late reply as I was on vacation.
>
No worries
>
> On 24/03/26 23:48, Pierre Gondois wrote:
>> Hello Sumit,
>>
>> On 3/17/26 16:10, Sumit Gupta wrote:
>>> Add kernel boot parameter 'cppc_cpufreq.auto_sel_mode' to enable CPPC
>>> autonomous performance selection on all CPUs at system startup without
>>> requiring runtime sysfs manipulation. When autonomous mode is enabled,
>>> the hardware automatically adjusts CPU performance based on workload
>>> demands using Energy Performance Preference (EPP) hints.
>>>
>>> When auto_sel_mode=1:
>>> - Configure all CPUs for autonomous operation on first init
>>> - Set EPP to performance preference (0x0)
>>> - Use HW min/max when set; otherwise program from policy limits (caps)
>>> - Clamp desired_perf to bounds before enabling autonomous mode
>>> - Hardware controls frequency instead of the OS governor
>>>
>>> The boot parameter is applied only during first policy initialization.
>>> On hotplug, skip applying it so that the user's runtime sysfs
>>> configuration is preserved.
>>>
>>> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> (Documentation)
>>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>>> ---
>>> Part 1 [1] of this series was applied for 7.1 and present in next.
>>> Sending this patch as reworked version of 'patch 11' from [2] based
>>> on next.
>>>
>>> [1]
>>> https://lore.kernel.org/lkml/20260206142658.72583-1-sumitg@nvidia.com/
>>> [2]
>>> https://lore.kernel.org/lkml/20251223121307.711773-1-sumitg@nvidia.com/
>>> ---
>>> .../admin-guide/kernel-parameters.txt | 13 +++
>>> drivers/cpufreq/cppc_cpufreq.c | 84
>>> +++++++++++++++++--
>>> 2 files changed, 92 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
>>> b/Documentation/admin-guide/kernel-parameters.txt
>>> index fa6171b5fdd5..de4b4c89edfe 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -1060,6 +1060,19 @@ Kernel parameters
>>> policy to use. This governor must be
>>> registered in the
>>> kernel before the cpufreq driver probes.
>>>
>>> + cppc_cpufreq.auto_sel_mode=
>>> + [CPU_FREQ] Enable ACPI CPPC autonomous
>>> performance
>>> + selection. When enabled, hardware
>>> automatically adjusts
>>> + CPU frequency on all CPUs based on workload
>>> demands.
>>> + In Autonomous mode, Energy Performance
>>> Preference (EPP)
>>> + hints guide hardware toward performance (0x0)
>>> or energy
>>> + efficiency (0xff).
>>> + Requires ACPI CPPC autonomous selection
>>> register support.
>>> + Format: <bool>
>>> + Default: 0 (disabled)
>>> + 0: use cpufreq governors
>>> + 1: enable if supported by hardware
>>> +
>>> cpu_init_udelay=N
>>> [X86,EARLY] Delay for N microsec between
>>> assert and de-assert
>>> of APIC INIT to start processors. This delay
>>> occurs
>>> diff --git a/drivers/cpufreq/cppc_cpufreq.c
>>> b/drivers/cpufreq/cppc_cpufreq.c
>>> index 5dfb109cf1f4..49c148b2a0a4 100644
>>> --- a/drivers/cpufreq/cppc_cpufreq.c
>>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>>> @@ -28,6 +28,9 @@
>>>
>>> static struct cpufreq_driver cppc_cpufreq_driver;
>>>
>>> +/* Autonomous Selection boot parameter */
>>> +static bool auto_sel_mode;
>>> +
>>> #ifdef CONFIG_ACPI_CPPC_CPUFREQ_FIE
>>> static enum {
>>> FIE_UNSET = -1,
>>> @@ -708,11 +711,74 @@ static int cppc_cpufreq_cpu_init(struct
>>> cpufreq_policy *policy)
>>> policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
>>> cpu_data->perf_ctrls.desired_perf = caps->highest_perf;
>>>
>>> - ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>> - if (ret) {
>>> - pr_debug("Err setting perf value:%d on CPU:%d. ret:%d\n",
>>> - caps->highest_perf, cpu, ret);
>>> - goto out;
>>> + /*
>>> + * Enable autonomous mode on first init if boot param is set.
>>> + * Check last_governor to detect first init and skip if auto_sel
>>> + * is already enabled.
>>> + */
>> If the goal is to set autosel only once at the driver init,
>> shouldn't this be done in cppc_cpufreq_init() ?
>> I understand that cpu_data doesn't exist yet in
>> cppc_cpufreq_init(), but this seems more appropriate to do
>> it there IMO.
>>
>> This means the cpudata should be updated accordingly
>> in this cppc_cpufreq_cpu_init() function.
>
> In an earlier version [1], the setup was in cppc_cpufreq_init() but
> was moved to cppc_cpufreq_cpu_init() to improve per-CPU error handling.
> Keeping the setup in cppc_cpufreq_init() helps to avoid the last_governor
> check. We can warn for a CPU failing to enable and continue so other
> CPUs keep autonomous mode.
> cppc_cpufreq_cpu_init() would then just check the auto_sel state
> from register and sync policy limits from min/max_perf registers when
> autonomous mode is active.
> Please let me know your thoughts.
FWIU the auto_sel_mode module parameter sets the default
auto_sel_mode when the driver is first loaded, so there should
be no need to check it again whenever cppc_cpufreq_cpu_init()
is called.
Maybe Ionela saw something we didn't see?
Also, just to be sure: should it still be possible to change
the auto_sel_mode through sysfs if the driver was loaded
with auto_sel_mode=1?
>
> [1]
> https://lore.kernel.org/lkml/5593d364-ca37-41c5-b33f-f7e245d6d626@nvidia.com/
>
>
>>
>>> + if (auto_sel_mode && policy->last_governor[0] == '\0' &&
>>> + !cpu_data->perf_ctrls.auto_sel) {
>>> + /* Enable CPPC - optional register, some platforms
>>> need it */
>> The documentation of the CPPC Enable Register is subject to
>> interpretation, but IIUC the field should be set to use the CPPC
>> controls, so I assume this should be set in cppc_cpufreq_init()
>> instead ?
>
> Agree that the CPPC Enable is about using the CPPC control path
> in general and not only for autonomous selection.
> Will move cppc_set_enable() into cppc_cpufreq_init() or outside the
> autonomous mode block in cppc_cpufreq_cpu_init() as per conclusion
> of previous comment.
>
>>> + ret = cppc_set_enable(cpu, true);
>>> + if (ret && ret != -EOPNOTSUPP)
>>> + pr_warn("Failed to enable CPPC for CPU%d
>>> (%d)\n", cpu, ret);
>>> +
>>> + /*
>>> + * Prefer HW min/max_perf when set; otherwise program
>>> from
>>> + * policy limits derived earlier from caps.
>>> + * Clamp desired_perf to bounds and sync policy->cur.
>>> + */
>>> + if (!cpu_data->perf_ctrls.min_perf ||
>>> !cpu_data->perf_ctrls.max_perf)
>>
>> The function doesn't seem to exist.
>
> It is newly added in [2].
> Don't need to call it if we move the setup to cppc_cpufreq_init().
Ah ok right thanks.
>
> [2]
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=ea3db45ae476889a1ba0ab3617e6afdeeefbda3d
>
>
>
>>
>>> + cppc_cpufreq_update_perf_limits(cpu_data, policy);
>>> +
>>> + cpu_data->perf_ctrls.desired_perf =
>>> + clamp_t(u32, cpu_data->perf_ctrls.desired_perf,
>>> + cpu_data->perf_ctrls.min_perf,
>>> + cpu_data->perf_ctrls.max_perf);
>>> +
>>> + policy->cur = cppc_perf_to_khz(caps,
>>> + cpu_data->perf_ctrls.desired_perf);
>>> +
>>
>> Maybe this should also be done in cppc_cpufreq_init()
>> if the auto_sel_mode parameter is set ?
>
> Yes.
>
>>
>>> + /* EPP is optional - some platforms may not support it */
>>> + ret = cppc_set_epp(cpu, CPPC_EPP_PERFORMANCE_PREF);
>>> + if (ret && ret != -EOPNOTSUPP)
>>> + pr_warn("Failed to set EPP for CPU%d (%d)\n",
>>> cpu, ret);
>>> + else if (!ret)
>>> + cpu_data->perf_ctrls.energy_perf =
>>> CPPC_EPP_PERFORMANCE_PREF;
>>> +
>>> + ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>> + if (ret) {
>>> + pr_debug("Err setting perf for autonomous mode
>>> CPU:%d ret:%d\n",
>>> + cpu, ret);
>>> + goto out;
>>> + }
>>> +
>>> + ret = cppc_set_auto_sel(cpu, true);
>>> + if (ret && ret != -EOPNOTSUPP) {
>>> + pr_warn("Failed autonomous config for CPU%d
>>> (%d)\n",
>>> + cpu, ret);
>>> + goto out;
>>> + }
>>> + if (!ret)
>>> + cpu_data->perf_ctrls.auto_sel = true;
>>> + }
>>> +
>>> + if (cpu_data->perf_ctrls.auto_sel) {
>>
>> There is a patchset ongoing which tries to remove
>> setting policy->min/max from driver initialization.
>> Indeed, these values are only temporarily valid,
>> until the governor override them.
>> It is not sure yet the patch will be accepted though.
>>
>> https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/
>>
>
>
> You are right that policy->min/max from .init() are temporary today
> as cpufreq_set_policy() overwrites them before the governor starts.
>
> On my test platform (highest == nominal, lowest_nonlinear == lowest),
> this had no visible effect because the BIOS bounds and cpuinfo range
> end up identical. But on platforms where they differ, the governor
> would widen the range to full cpuinfo limits.
>
> I think your patch [3] fixes this by giving these the right semantic as
> initial QoS requests. With it, cpufreq_set_policy() preserves the policy
> limits set from min/max_perf registers in .init(), which can either be
> BIOS values on first boot or last user configured values before hotplug.
>
> I will update the comment in v2 to reflect QoS seeding intent.
>
> I see that the first two patches of your series [3] are applied for 7.1.
> Do you plan to send the pending patch (3/4) from [3]?
>
I need to ping Viresh to check if this is still relevant.
> [3]
> https://lore.kernel.org/lkml/20260317101753.2284763-4-pierre.gondois@arm.com/
>
>
>>
>>
>>> + /* Sync policy limits from HW when autonomous mode is
>>> active */
>>> + policy->min = cppc_perf_to_khz(caps,
>>> + cpu_data->perf_ctrls.min_perf ?:
>>> + caps->lowest_nonlinear_perf);
>>> + policy->max = cppc_perf_to_khz(caps,
>>> + cpu_data->perf_ctrls.max_perf ?:
>>> + caps->nominal_perf);
>>> + } else {
>>> + /* Normal mode: governors control frequency */
>>> + ret = cppc_set_perf(cpu, &cpu_data->perf_ctrls);
>>> + if (ret) {
>>> + pr_debug("Err setting perf value:%d on CPU:%d.
>>> ret:%d\n",
>>> + caps->highest_perf, cpu, ret);
>>> + goto out;
>>> + }
>>> }
>>>
>>> cppc_cpufreq_cpu_fie_init(policy);
>>> @@ -1038,10 +1104,18 @@ static int __init cppc_cpufreq_init(void)
>>>
>>> static void __exit cppc_cpufreq_exit(void)
>>> {
>>> + unsigned int cpu;
>>> +
>>> + for_each_present_cpu(cpu)
>>> + cppc_set_auto_sel(cpu, false);
>>
>> If the firmware has a default EPP value, it means that loading
>> and then unloading the driver will reset this default EPP value.
>> Maybe the initial EPP value and/or the auto_sel value should be
>> cached somewhere and restored on exit ?
>> I don't know if this is actually an issue, this is just to signal it.
>
> The auto_sel_mode boot path programs EPP to performance preference(0),
> not the firmware’s previous value. On unload we only call
> cppc_set_auto_sel(false); we do not restore EPP, min/max perf,
> or other CPPC fields to firmware defaults.
Yes right, so loading/unloading the driver might change the
default EPP value.
>
> Thank you,
> Sumit Gupta
>
> ....
>
>
* Re: [patch V2 08/11] fs/timerfd: Use the new alarm/hrtimer functions
From: Frederic Weisbecker @ 2026-04-10 13:46 UTC
To: Thomas Gleixner
Cc: LKML, Alexander Viro, Christian Brauner, Jan Kara,
Anna-Maria Behnsen, linux-fsdevel, Calvin Owens,
Peter Zijlstra (Intel), John Stultz, Stephen Boyd,
Sebastian Reichel, linux-pm, Pablo Neira Ayuso, Florian Westphal,
Phil Sutter, netfilter-devel, coreteam
In-Reply-To: <20260408114952.469141112@kernel.org>
Le Wed, Apr 08, 2026 at 01:54:20PM +0200, Thomas Gleixner a écrit :
> Like any other user controlled interface, timerfd based timers can be
> programmed with expiry times in the past or very small intervals.
>
> Both hrtimer and alarmtimer provide new interfaces which return the queued
> state of the timer. If the timer was already expired, then let the callsite
> handle the timerfd context update so that the full round trip through the
> hrtimer interrupt is avoided.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> Cc: Christian Brauner <brauner@kernel.org>
> Cc: Jan Kara <jack@suse.cz>
> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
--
Frederic Weisbecker
SUSE Labs
* [GIT PULL] turbostat fixes for 7.0
From: Len Brown @ 2026-04-10 13:40 UTC
To: Linus Torvalds; +Cc: Linux PM list, Linux Kernel Mailing List
Hi Linus,
Please pull these turbostat-fixes-for-7.0 patches.
thanks!
Len Brown, Intel Open Source Technology Center
The following changes since commit 1f318b96cc84d7c2ab792fcc0bfd42a7ca890681:
Linux 7.0-rc3 (2026-03-08 16:56:54 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git
tags/turbostat-fixes-for-7.0
for you to fetch changes up to ba893caead54745595e29953f0531cf3651610aa:
tools/power turbostat: Allow execution to continue after
perf_l2_init() failure (2026-04-10 09:04:32 -0400)
----------------------------------------------------------------
Turbostat Fixes
Fix a memory allocation issue that could corrupt output values or SEGV.
Fix a perf initialization issue that could exit on some HW + kernels.
Minor fixes.
----------------------------------------------------------------
Artem Bityutskiy (4):
tools/power turbostat: Consistently use print_float_value()
tools/power turbostat: Fix incorrect format variable
tools/power turbostat: Fix --show/--hide for individual cpuidle counters
tools/power turbostat: Fix delimiter bug in print functions
David Arcari (1):
tools/power turbostat: Allow execution to continue after
perf_l2_init() failure
Len Brown (1):
tools/power turbostat: Fix swidle header vs data display
Serhii Pievniev (1):
tools/power/turbostat: Fix microcode patch level output for AMD/Hygon
Zhang Rui (2):
tools/power turbostat: Fix illegal memory access when SMT is
present and disabled
tools/power turbostat: Eliminate unnecessary data structure allocation
tools/power/x86/turbostat/turbostat.c | 100 ++++++++++++++++++----------------
1 file changed, 54 insertions(+), 46 deletions(-)
* [PATCH 9/9] tools/power turbostat: Allow execution to continue after perf_l2_init() failure
From: Len Brown @ 2026-04-10 13:25 UTC
To: linux-pm; +Cc: David Arcari, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>
From: David Arcari <darcari@redhat.com>
Currently, if perf_l2_init() fails turbostat exits after issuing the
following error (which was encountered on AlderLake):
turbostat: perf_l2_init(cpu0, 0x0, 0xff24) REFS: Invalid argument
This occurs because perf_l2_init() calls err(). However, the code has been
written in such a manner that it is able to perform cleanup and continue.
Therefore, this issue can be addressed by changing the appropriate calls
to err() to warnx().
Additionally, correct the PMU type arguments passed to the warning strings
in the ecore and lcore blocks so the logs accurately reflect the failing
counter type.
Signed-off-by: David Arcari <darcari@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
tools/power/x86/turbostat/turbostat.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 34e2143cd4b3..e9e8ef72395a 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -9405,13 +9405,13 @@ void perf_l2_init(void)
if (!is_hybrid) {
fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.uniform, perf_model_support->first.refs, -1, PERF_FORMAT_GROUP);
if (fd_l2_percpu[cpu] == -1) {
- err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.refs);
+ warnx("%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.refs);
free_fd_l2_percpu();
return;
}
retval = open_perf_counter(cpu, perf_pmu_types.uniform, perf_model_support->first.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP);
if (retval == -1) {
- err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.hits);
+ warnx("%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.uniform, perf_model_support->first.hits);
free_fd_l2_percpu();
return;
}
@@ -9420,39 +9420,39 @@ void perf_l2_init(void)
if (perf_pcore_set && CPU_ISSET_S(cpu, cpu_possible_setsize, perf_pcore_set)) {
fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.pcore, perf_model_support->first.refs, -1, PERF_FORMAT_GROUP);
if (fd_l2_percpu[cpu] == -1) {
- err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.refs);
+ warnx("%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.refs);
free_fd_l2_percpu();
return;
}
retval = open_perf_counter(cpu, perf_pmu_types.pcore, perf_model_support->first.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP);
if (retval == -1) {
- err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.hits);
+ warnx("%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->first.hits);
free_fd_l2_percpu();
return;
}
} else if (perf_ecore_set && CPU_ISSET_S(cpu, cpu_possible_setsize, perf_ecore_set)) {
fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.ecore, perf_model_support->second.refs, -1, PERF_FORMAT_GROUP);
if (fd_l2_percpu[cpu] == -1) {
- err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->second.refs);
+ warnx("%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.ecore, perf_model_support->second.refs);
free_fd_l2_percpu();
return;
}
retval = open_perf_counter(cpu, perf_pmu_types.ecore, perf_model_support->second.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP);
if (retval == -1) {
- err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->second.hits);
+ warnx("%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.ecore, perf_model_support->second.hits);
free_fd_l2_percpu();
return;
}
} else if (perf_lcore_set && CPU_ISSET_S(cpu, cpu_possible_setsize, perf_lcore_set)) {
fd_l2_percpu[cpu] = open_perf_counter(cpu, perf_pmu_types.lcore, perf_model_support->third.refs, -1, PERF_FORMAT_GROUP);
if (fd_l2_percpu[cpu] == -1) {
- err(-1, "%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->third.refs);
+ warnx("%s(cpu%d, 0x%x, 0x%llx) REFS", __func__, cpu, perf_pmu_types.lcore, perf_model_support->third.refs);
free_fd_l2_percpu();
return;
}
retval = open_perf_counter(cpu, perf_pmu_types.lcore, perf_model_support->third.hits, fd_l2_percpu[cpu], PERF_FORMAT_GROUP);
if (retval == -1) {
- err(-1, "%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.pcore, perf_model_support->third.hits);
+ warnx("%s(cpu%d, 0x%x, 0x%llx) HITS", __func__, cpu, perf_pmu_types.lcore, perf_model_support->third.hits);
free_fd_l2_percpu();
return;
}
--
2.45.2
* [PATCH 8/9] tools/power turbostat: Fix delimiter bug in print functions
From: Len Brown @ 2026-04-10 13:25 UTC
To: linux-pm; +Cc: Artem Bityutskiy, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>
From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Commands that add counters, such as 'turbostat --show C1,C1+'
display merged columns without a delimiter.
This is caused by the bad syntax: '(*printed++ ? delim : "")', shared by
print_name()/print_hex_value()/print_decimal_value()/print_float_value()
Use '((*printed)++ ? delim : "")' to correctly increment the value at *printed.
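
The precedence issue can be reproduced outside turbostat with a small sketch. The helper names below (delim_buggy, delim_fixed, two_columns) are made up for illustration; only the two expressions are taken from the commit message. Postfix ++ binds tighter than unary *, so '*printed++' means '*(printed++)': it reads the counter and then advances the local pointer copy, leaving the counter itself untouched.

```c
#include <assert.h>
#include <stdio.h>

/* Buggy form: parsed as *(printed++), so the counter is never incremented. */
static const char *delim_buggy(int *printed, const char *delim)
{
	assert(printed != NULL);
	return (*printed++ ? delim : "");
}

/* Fixed form: increments the int that printed points to. */
static const char *delim_fixed(int *printed, const char *delim)
{
	assert(printed != NULL);
	return ((*printed)++ ? delim : "");
}

/*
 * Hypothetical driver: format two column names into buf, using the given
 * delimiter-selection function, starting from printed == 0.
 */
static void two_columns(char *buf, size_t len,
			const char *(*pick)(int *, const char *))
{
	int printed = 0;
	int n;

	n = snprintf(buf, len, "%s%s", pick(&printed, "\t"), "C1");
	snprintf(buf + n, len - (size_t)n, "%s%s", pick(&printed, "\t"), "C1+");
}
```

With delim_buggy the counter stays at 0, so the second column is emitted without a delimiter and the two names run together; with delim_fixed the second call sees a non-zero counter and emits the tab.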
[lenb: fix code and commit message typo, re-word]
Fixes: 56dbb878507b ("tools/power turbostat: Refactor added column header printing")
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
tools/power/x86/turbostat/turbostat.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 3487548841e1..34e2143cd4b3 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2837,29 +2837,29 @@ static inline int print_name(int width, int *printed, char *delim, char *name, e
UNUSED(type);
if (format == FORMAT_RAW && width >= 64)
- return (sprintf(outp, "%s%-8s", (*printed++ ? delim : ""), name));
+ return (sprintf(outp, "%s%-8s", ((*printed)++ ? delim : ""), name));
else
- return (sprintf(outp, "%s%s", (*printed++ ? delim : ""), name));
+ return (sprintf(outp, "%s%s", ((*printed)++ ? delim : ""), name));
}
static inline int print_hex_value(int width, int *printed, char *delim, unsigned long long value)
{
if (width <= 32)
- return (sprintf(outp, "%s%08x", (*printed++ ? delim : ""), (unsigned int)value));
+ return (sprintf(outp, "%s%08x", ((*printed)++ ? delim : ""), (unsigned int)value));
else
- return (sprintf(outp, "%s%016llx", (*printed++ ? delim : ""), value));
+ return (sprintf(outp, "%s%016llx", ((*printed)++ ? delim : ""), value));
}
static inline int print_decimal_value(int width, int *printed, char *delim, unsigned long long value)
{
UNUSED(width);
- return (sprintf(outp, "%s%lld", (*printed++ ? delim : ""), value));
+ return (sprintf(outp, "%s%lld", ((*printed)++ ? delim : ""), value));
}
static inline int print_float_value(int *printed, char *delim, double value)
{
- return (sprintf(outp, "%s%0.2f", (*printed++ ? delim : ""), value));
+ return (sprintf(outp, "%s%0.2f", ((*printed)++ ? delim : ""), value));
}
void print_header(char *delim)
--
2.45.2
* [PATCH 7/9] tools/power turbostat: Fix --show/--hide for individual cpuidle counters
From: Len Brown @ 2026-04-10 13:25 UTC
To: linux-pm; +Cc: Artem Bityutskiy, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>
From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Problem: individual swidle counter names (C1, C1+, C1-, etc.) cannot be
selected via --show/--hide due to two bugs in probe_cpuidle_counts():
1. The function returns immediately when BIC_cpuidle is not enabled,
without checking deferred_add_index.
2. The deferred name check runs against name_buf before the trailing
newline is stripped, so is_deferred_add("C1\n") never matches "C1".
Fix:
1. Relax the early return to pass through when deferred names are
queued.
2. Strip the trailing newline from name_buf before performing deferred
name checks.
3. Check each suffixed variant (C1+, C1, C1-) individually so that
e.g. "--show C1+" enables only the requested metric.
In addition, introduce a helper function to avoid repeating the
condition (readability cleanup).
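
Bug 2 above comes down to an exact string comparison against a buffer that still ends in '\n'. The following is a rough sketch under simplifying assumptions: name_requested() is a hypothetical stand-in for is_deferred_add() that compares against a single requested name, and strip_newline() is an illustrative helper rather than the code in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-in for is_deferred_add(): exact match against a
 * single name requested on the command line.
 */
static bool name_requested(const char *requested, const char *name)
{
	return strcmp(requested, name) == 0;
}

/* Cut the string at the first newline, if there is one. */
static void strip_newline(char *s)
{
	assert(s != NULL);
	s[strcspn(s, "\n")] = '\0';
}
```

A name read from sysfs such as "C1\n" does not compare equal to "C1" until strip_newline() has run, which is why the check has to happen after the newline is removed.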
Fixes: ec4acd3166d8 ("tools/power turbostat: disable "cpuidle" invocation counters, by default")
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
tools/power/x86/turbostat/turbostat.c | 35 ++++++++++++++++-----------
1 file changed, 21 insertions(+), 14 deletions(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 4d954533c71d..3487548841e1 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -11285,6 +11285,14 @@ void probe_cpuidle_residency(void)
}
}
+static bool cpuidle_counter_wanted(char *name)
+{
+ if (is_deferred_skip(name))
+ return false;
+
+ return DO_BIC(BIC_cpuidle) || is_deferred_add(name);
+}
+
void probe_cpuidle_counts(void)
{
char path[64];
@@ -11294,7 +11302,7 @@ void probe_cpuidle_counts(void)
int min_state = 1024, max_state = 0;
char *sp;
- if (!DO_BIC(BIC_cpuidle))
+ if (!DO_BIC(BIC_cpuidle) && !deferred_add_index)
return;
for (state = 10; state >= 0; --state) {
@@ -11309,12 +11317,6 @@ void probe_cpuidle_counts(void)
remove_underbar(name_buf);
- if (!DO_BIC(BIC_cpuidle) && !is_deferred_add(name_buf))
- continue;
-
- if (is_deferred_skip(name_buf))
- continue;
-
/* truncate "C1-HSW\n" to "C1", or truncate "C1\n" to "C1" */
sp = strchr(name_buf, '-');
if (!sp)
@@ -11329,16 +11331,19 @@ void probe_cpuidle_counts(void)
* Add 'C1+' for C1, and so on. The 'below' sysfs file always contains 0 for
* the last state, so do not add it.
*/
-
*sp = '+';
*(sp + 1) = '\0';
- sprintf(path, "cpuidle/state%d/below", state);
- add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+ if (cpuidle_counter_wanted(name_buf)) {
+ sprintf(path, "cpuidle/state%d/below", state);
+ add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+ }
}
*sp = '\0';
- sprintf(path, "cpuidle/state%d/usage", state);
- add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+ if (cpuidle_counter_wanted(name_buf)) {
+ sprintf(path, "cpuidle/state%d/usage", state);
+ add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+ }
/*
* The 'above' sysfs file always contains 0 for the shallowest state (smallest
@@ -11347,8 +11352,10 @@ void probe_cpuidle_counts(void)
if (state != min_state) {
*sp = '-';
*(sp + 1) = '\0';
- sprintf(path, "cpuidle/state%d/above", state);
- add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+ if (cpuidle_counter_wanted(name_buf)) {
+ sprintf(path, "cpuidle/state%d/above", state);
+ add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_ITEMS, FORMAT_DELTA, SYSFS_PERCPU, 0);
+ }
}
}
}
--
2.45.2
* [PATCH 6/9] tools/power turbostat: Fix incorrect format variable
From: Len Brown @ 2026-04-10 13:25 UTC
To: linux-pm; +Cc: Artem Bityutskiy, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>
From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
In the perf thread, core, and package counter loops, an incorrect
'mp->format' variable is used instead of 'pp->format'.
[lenb: edit commit message]
Fixes: 696d15cbd8c2 ("tools/power turbostat: Refactor floating point printout code")
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
tools/power/x86/turbostat/turbostat.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 9744f9caac9a..4d954533c71d 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -3468,7 +3468,7 @@ int format_counters(PER_THREAD_PARAMS)
for (i = 0, pp = sys.perf_tp; pp; ++i, pp = pp->next) {
if (pp->format == FORMAT_RAW)
outp += print_hex_value(pp->width, &printed, delim, t->perf_counter[i]);
- else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
+ else if (pp->format == FORMAT_DELTA || pp->format == FORMAT_AVERAGE)
outp += print_decimal_value(pp->width, &printed, delim, t->perf_counter[i]);
else if (pp->format == FORMAT_PERCENT) {
if (pp->type == COUNTER_USEC)
@@ -3538,7 +3538,7 @@ int format_counters(PER_THREAD_PARAMS)
for (i = 0, pp = sys.perf_cp; pp; i++, pp = pp->next) {
if (pp->format == FORMAT_RAW)
outp += print_hex_value(pp->width, &printed, delim, c->perf_counter[i]);
- else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
+ else if (pp->format == FORMAT_DELTA || pp->format == FORMAT_AVERAGE)
outp += print_decimal_value(pp->width, &printed, delim, c->perf_counter[i]);
else if (pp->format == FORMAT_PERCENT)
outp += print_float_value(&printed, delim, pct(c->perf_counter[i], tsc));
@@ -3694,7 +3694,7 @@ int format_counters(PER_THREAD_PARAMS)
outp += print_hex_value(pp->width, &printed, delim, p->perf_counter[i]);
else if (pp->type == COUNTER_K2M)
outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), (unsigned int)p->perf_counter[i] / 1000);
- else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
+ else if (pp->format == FORMAT_DELTA || pp->format == FORMAT_AVERAGE)
outp += print_decimal_value(pp->width, &printed, delim, p->perf_counter[i]);
else if (pp->format == FORMAT_PERCENT)
outp += print_float_value(&printed, delim, pct(p->perf_counter[i], tsc));
--
2.45.2
* [PATCH 5/9] tools/power turbostat: Consistently use print_float_value()
From: Len Brown @ 2026-04-10 13:25 UTC
To: linux-pm; +Cc: Artem Bityutskiy, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>
From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Fix the PMT thread code to use print_float_value(),
to be consistent with the PMT core and package code.
[lenb: commit message]
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
tools/power/x86/turbostat/turbostat.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index b985bce69142..9744f9caac9a 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -3489,12 +3489,12 @@ int format_counters(PER_THREAD_PARAMS)
case PMT_TYPE_XTAL_TIME:
value_converted = pct(value_raw / crystal_hz, interval_float);
- outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted);
+ outp += print_float_value(&printed, delim, value_converted);
break;
case PMT_TYPE_TCORE_CLOCK:
value_converted = pct(value_raw / tcore_clock_freq_hz, interval_float);
- outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted);
+ outp += print_float_value(&printed, delim, value_converted);
}
}
--
2.45.2
* [PATCH 4/9] tools/power/turbostat: Fix microcode patch level output for AMD/Hygon
From: Len Brown @ 2026-04-10 13:25 UTC
To: linux-pm; +Cc: Serhii Pievniev, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>
From: Serhii Pievniev <spevnev16@gmail.com>
turbostat always used the same logic to read the microcode patch level,
which is correct for Intel but not for AMD/Hygon.
While Intel stores the patch level in the upper 32 bits of MSR, AMD
stores it in the lower 32 bits, which causes turbostat to report the
microcode version as 0x0 on AMD/Hygon.
Fix by shifting right by 32 for non-AMD/Hygon, preserving the existing
behavior for Intel and unknown vendors.
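
The vendor-dependent extraction described above can be sketched as a small pure function. This is an illustration only: ucode_patch_level() and its boolean parameter are hypothetical names, and the MSR values in the self-check are made-up examples rather than readings from real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the 32-bit patch level out of the raw 64-bit
 * MSR value. Intel (and unknown vendors) keep it in the upper 32 bits;
 * AMD/Hygon keep it in the lower 32 bits.
 */
static uint32_t ucode_patch_level(uint64_t msr_value, bool is_amd_or_hygon)
{
	if (!is_amd_or_hygon)
		msr_value >>= 32;

	return (uint32_t)msr_value;
}

/* Illustrative self-check with made-up MSR values. */
static void ucode_patch_level_selfcheck(void)
{
	/* Intel-style layout: patch level in bits 63:32. */
	assert(ucode_patch_level(0x000000b400000000ULL, false) == 0xb4);
	/* AMD-style layout: patch level in bits 31:0. */
	assert(ucode_patch_level(0x0a201025ULL, true) == 0x0a201025);
	/* Applying the Intel rule to an AMD-style value yields 0, as reported. */
	assert(ucode_patch_level(0x0a201025ULL, false) == 0);
}
```

The last assertion reproduces the symptom from the commit message: shifting an AMD-style value right by 32 leaves 0x0.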
Fixes: 3e4048466c39 ("tools/power turbostat: Add --no-msr option")
Signed-off-by: Serhii Pievniev <spevnev16@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
tools/power/x86/turbostat/turbostat.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 14021a6ed717..b985bce69142 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -9121,10 +9121,13 @@ void process_cpuid()
cpuid_has_hv = ecx_flags & (1 << 31);
if (!no_msr) {
- if (get_msr(sched_getcpu(), MSR_IA32_UCODE_REV, &ucode_patch))
+ if (get_msr(sched_getcpu(), MSR_IA32_UCODE_REV, &ucode_patch)) {
warnx("get_msr(UCODE)");
- else
+ } else {
ucode_patch_valid = true;
+ if (!authentic_amd && !hygon_genuine)
+ ucode_patch >>= 32;
+ }
}
/*
@@ -9138,7 +9141,7 @@ void process_cpuid()
if (!quiet) {
fprintf(outf, "CPUID(1): family:model:stepping 0x%x:%x:%x (%d:%d:%d)", family, model, stepping, family, model, stepping);
if (ucode_patch_valid)
- fprintf(outf, " microcode 0x%x", (unsigned int)((ucode_patch >> 32) & 0xFFFFFFFF));
+ fprintf(outf, " microcode 0x%x", (unsigned int)ucode_patch);
fputc('\n', outf);
fprintf(outf, "CPUID(0x80000000): max_extended_levels: 0x%x\n", max_extended_level);
--
2.45.2
^ permalink raw reply related
* [PATCH 3/9] tools/power turbostat: Eliminate unnecessary data structure allocation
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
To: linux-pm; +Cc: Zhang Rui, Len Brown
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>
From: Zhang Rui <rui.zhang@intel.com>
Linux core_id's are a per-package namespace, not a per-node namespace.
Rename topo.cores_per_node to topo.cores_per_pkg to reflect this.
Eliminate topo.nodes_per_pkg from the sizing for core data structures,
since it has no role except to unnecessarily bloat the allocation.
Validated on multiple Intel platforms (ICX/SPR/SRF/EMR/GNR/CWF) with
various CPU online/offline configurations and SMT enabled/disabled
scenarios.
No functional changes.
[lenb: commit message]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
tools/power/x86/turbostat/turbostat.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 791b9154f662..14021a6ed717 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2409,7 +2409,7 @@ struct topo_params {
int max_l3_id;
int max_node_num;
int nodes_per_pkg;
- int cores_per_node;
+ int cores_per_pkg;
int threads_per_core;
} topo;
@@ -9633,9 +9633,9 @@ void topology_probe(bool startup)
topo.max_core_id = max_core_id; /* within a package */
topo.max_package_id = max_package_id;
- topo.cores_per_node = max_core_id + 1;
+ topo.cores_per_pkg = max_core_id + 1;
if (debug > 1)
- fprintf(outf, "max_core_id %d, sizing for %d cores per package\n", max_core_id, topo.cores_per_node);
+ fprintf(outf, "max_core_id %d, sizing for %d cores per package\n", max_core_id, topo.cores_per_pkg);
if (!summary_only)
BIC_PRESENT(BIC_Core);
@@ -9700,7 +9700,7 @@ void allocate_counters_1(struct counters *counters)
void allocate_counters(struct counters *counters)
{
int i;
- int num_cores = topo.cores_per_node * topo.nodes_per_pkg * topo.num_packages;
+ int num_cores = topo.cores_per_pkg * topo.num_packages;
counters->threads = calloc(topo.max_cpu_num + 1, sizeof(struct thread_data));
if (counters->threads == NULL)
--
2.45.2
^ permalink raw reply related
* [PATCH 2/9] tools/power turbostat: Fix swidle header vs data display
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
To: linux-pm; +Cc: Len Brown, Artem Bityutskiy
In-Reply-To: <57d2371d52be1d574b33382bfbf8052485b99d8b.1775827309.git.len.brown@intel.com>
From: Len Brown <len.brown@intel.com>
I changed my mind about displaying swidle statistics,
which are "added counters". Recently I reverted the
column headers to 8-columns, but kept print_decimal_value()
padding out to 16-columns for all 64-bit counters.
Simplify by keeping print_decimal_value() at %lld -- which
will often fit into 8-columns, and live with the fact
that it can overflow and shift the other columns,
which continue to be tab-delimited.
This is a better compromise than inserting a bunch
of space characters that most users don't like.
Fixes: 1a23ba6a1ba2 ("tools/power turbostat: Print wide names only for RAW 64-bit columns")
Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
tools/power/x86/turbostat/turbostat.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index ae827485950d..791b9154f662 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -2852,10 +2852,9 @@ static inline int print_hex_value(int width, int *printed, char *delim, unsigned
static inline int print_decimal_value(int width, int *printed, char *delim, unsigned long long value)
{
- if (width <= 32)
- return (sprintf(outp, "%s%d", (*printed++ ? delim : ""), (unsigned int)value));
- else
- return (sprintf(outp, "%s%-8lld", (*printed++ ? delim : ""), value));
+ UNUSED(width);
+
+ return (sprintf(outp, "%s%lld", (*printed++ ? delim : ""), value));
}
static inline int print_float_value(int *printed, char *delim, double value)
--
2.45.2
^ permalink raw reply related
* [PATCH 1/9] tools/power turbostat: Fix illegal memory access when SMT is present and disabled
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
To: linux-pm; +Cc: Zhang Rui, Len Brown
In-Reply-To: <20260410132836.398255-1-lenb@kernel.org>
From: Zhang Rui <rui.zhang@intel.com>
When SMT is present and disabled, turbostat may under-size
the thread_data array. This can corrupt results or
cause turbostat to exit with a segmentation fault.
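
A minimal sketch of the sizing rule the patch below switches to: one slot per possible CPU number, so that any cpu# in 0..max_cpu_num is a valid index regardless of how many threads per core are online. struct thread_slot and alloc_thread_slots() are hypothetical stand-ins used for illustration, not turbostat code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for turbostat's struct thread_data. */
struct thread_slot {
	int cpu_id;
};

/*
 * Allocate one slot per possible CPU number (0..max_cpu_num) and mark
 * every slot as unused. Returns NULL on bad input or allocation failure.
 */
static struct thread_slot *alloc_thread_slots(int max_cpu_num)
{
	struct thread_slot *slots;
	int i;

	if (max_cpu_num < 0)
		return NULL;

	slots = calloc((size_t)max_cpu_num + 1, sizeof(*slots));
	if (slots == NULL)
		return NULL;

	for (i = 0; i <= max_cpu_num; i++)
		slots[i].cpu_id = -1;

	return slots;
}
```

Sizing by the highest CPU number may allocate a few unused slots, but it cannot be smaller than the largest index that is later used.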
[lenb: commit message]
Fixes: a2b4d0f8bf07 ("tools/power turbostat: Favor cpu# over core#")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
tools/power/x86/turbostat/turbostat.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 1a2671c28209..ae827485950d 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -9702,13 +9702,12 @@ void allocate_counters(struct counters *counters)
{
int i;
int num_cores = topo.cores_per_node * topo.nodes_per_pkg * topo.num_packages;
- int num_threads = topo.threads_per_core * num_cores;
- counters->threads = calloc(num_threads, sizeof(struct thread_data));
+ counters->threads = calloc(topo.max_cpu_num + 1, sizeof(struct thread_data));
if (counters->threads == NULL)
goto error;
- for (i = 0; i < num_threads; i++)
+ for (i = 0; i < topo.max_cpu_num + 1; i++)
(counters->threads)[i].cpu_id = -1;
counters->cores = calloc(num_cores, sizeof(struct core_data));
--
2.45.2
^ permalink raw reply related
* [PATCH 0/9] turbostat fixes for 7.0
From: Len Brown @ 2026-04-10 13:25 UTC (permalink / raw)
To: linux-pm
Please let me know if you see any issues with these fixes.
thanks!
Len Brown, Intel Open Source Technology Center
Available here:
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git turbostat
----------------------------------------------------------------
Artem Bityutskiy (4):
tools/power turbostat: Consistently use print_float_value()
tools/power turbostat: Fix incorrect format variable
tools/power turbostat: Fix --show/--hide for individual cpuidle counters
tools/power turbostat: Fix delimiter bug in print functions
David Arcari (1):
tools/power turbostat: Allow execution to continue after perf_l2_init() failure
Len Brown (1):
tools/power turbostat: Fix swidle header vs data display
Serhii Pievniev (1):
tools/power/turbostat: Fix microcode patch level output for AMD/Hygon
Zhang Rui (2):
tools/power turbostat: Fix illegal memory access when SMT is present and disabled
tools/power turbostat: Eliminate unnecessary data structure allocation
tools/power/x86/turbostat/turbostat.c | 100 ++++++++++++++++++----------------
1 file changed, 54 insertions(+), 46 deletions(-)
^ permalink raw reply
* Re: [PATCH] tools/power turbostat: Allow execution to continue after perf_l2_init() failure
From: Len Brown @ 2026-04-10 12:59 UTC (permalink / raw)
To: David Arcari; +Cc: linux-pm, linux-kernel
In-Reply-To: <CAJvTdKmLA=n07hkkXLbwMex-_Byji9EOy7wkkFHrcPCj8tzMqw@mail.gmail.com>
Thank you for the patch, David, it is helpful.
I agree that turbostat should do its best to run properly when the
underlying kernel doesn't have full support, and you found a
configuration that I missed.
I'd like to understand how/why your kernel perf support is failing on
Alder Lake to be sure turbostat is coping the best it can.
If you can identify an upstream kernel version that fails this way,
that would be great.
You can poke with "perf stat" as well, but this will depend on what
.json counter list is compiled into your version of perf.
Probably a first sanity check would be whether these commands for the
LLC and the L2 work:
sudo perf stat -e cache-misses sleep 1
sudo perf stat -e L2_REQUEST.ALL sleep 1
Also, with your L2 patch applied, does turbostat still successfully
show the LLC stats?
thanks,
Len Brown, Intel Open Source Technology Center
^ permalink raw reply
* Re: [patch V2 06/11] alarmtimer: Provide alarm_start_timer()
From: Frederic Weisbecker @ 2026-04-10 12:45 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, John Stultz, Stephen Boyd, Calvin Owens, Anna-Maria Behnsen,
Peter Zijlstra (Intel), Alexander Viro, Christian Brauner,
Jan Kara, linux-fsdevel, Sebastian Reichel, linux-pm,
Pablo Neira Ayuso, Florian Westphal, Phil Sutter, netfilter-devel,
coreteam
In-Reply-To: <20260408114952.332822525@kernel.org>
Le Wed, Apr 08, 2026 at 01:54:11PM +0200, Thomas Gleixner a écrit :
> Alarm timers utilize hrtimers for normal operation and only switch to the
> RTC on suspend. In order to catch already expired timers early and without
> going through a timer interrupt cycle, provide a new start function which
> internally uses hrtimer_start_range_ns_user().
>
> If hrtimer_start_range_ns_user() detects an already expired timer, it does
> not queue it. In that case remove the timer from the alarm base as well.
>
> Return the status queued or not back to the caller to handle the early
> expiry.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Acked-by: John Stultz <jstultz@google.com>
> Cc: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
--
Frederic Weisbecker
SUSE Labs
^ permalink raw reply
* Re: [patch V2 07/11] alarmtimer: Convert posix timer functions to alarm_start_timer()
From: Frederic Weisbecker @ 2026-04-10 12:37 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, John Stultz, Stephen Boyd, Calvin Owens, Anna-Maria Behnsen,
Peter Zijlstra (Intel), Alexander Viro, Christian Brauner,
Jan Kara, linux-fsdevel, Sebastian Reichel, linux-pm,
Pablo Neira Ayuso, Florian Westphal, Phil Sutter, netfilter-devel,
coreteam
In-Reply-To: <20260408114952.400451460@kernel.org>
Le Wed, Apr 08, 2026 at 01:54:16PM +0200, Thomas Gleixner a écrit :
> Use the new alarm_start_timer() for arming and rearming posix interval
> timers and for clock_nanosleep() so that already expired timers do not go
> through the full timer interrupt cycle.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Acked-by: John Stultz <jstultz@google.com>
> Cc: Stephen Boyd <sboyd@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
--
Frederic Weisbecker
SUSE Labs
^ permalink raw reply
* Re: [GIT PULL] thermal drivers changes for v7.1-rc1
From: Rafael J. Wysocki @ 2026-04-10 11:34 UTC (permalink / raw)
To: Daniel Lezcano
Cc: Rafael J. Wysocki, Linux Kernel Mailing List,
Linux PM mailing list, Alexander Stein, Thorsten Blum,
Richard Acayan, Manaf Meethalavalappu Pallikunhi,
Krzysztof Kozlowski, Gopi Krishna Menon, John Madieu
In-Reply-To: <298af5b4-f008-4a09-a6be-7c0652392ea6@oss.qualcomm.com>
Hi Daniel,
On Thu, Apr 9, 2026 at 10:18 PM Daniel Lezcano
<daniel.lezcano@oss.qualcomm.com> wrote:
>
> Hi Rafael,
>
> The following changes since commit 1f318b96cc84d7c2ab792fcc0bfd42a7ca890681:
>
> Linux 7.0-rc3 (2026-03-08 16:56:54 -0700)
>
> are available in the Git repository at:
>
>
> ssh://git@gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git
> tags/thermal-v7.1-rc1
>
> for you to fetch changes up to bf746e2a41efd98668c97759e06d436ae5af5a82:
>
> thermal: renesas: rzg3e: Remove stale @trim_offset kernel-doc entry
> (2026-04-09 21:47:15 +0200)
>
> ----------------------------------------------------------------
> - Added an OF node address to output message to make sensor names more
> distinguishable (Alexander Stein)
>
> - Added hwmon support for the i.MX91 thermal sensor (Alexander Stein)
>
> - Correctly clamped the results when doing value/temperature conversion
> in the Spreadtrum driver (Thorsten Blum)
>
> - Added the SDM670 compatible DT bindings for the Tsens and the LMH
> drivers (Richard Acayan)
>
> - Added the SM8750 compatible DT bindings for the Tsens (Manaf
> Meethalavalappu Pallikunhi)
>
> - Added the Eliza SoC compatible DT bindings for the Tsens (Krzysztof
> Kozlowski)
>
> - Fixed inverted condition check on error in the Spear driver (Gopi
> Krishna Menon)
>
> - Converted the DT bindings documentation into DT schema (Gopi Krishna
> Menon)
>
> - Used max() macro to increase readability in the Broadcom STB thermal
> sensor (Thorsten Blum)
>
> - Removed stale @trim_offset kernel-doc entry (John Madieu)
>
> ----------------------------------------------------------------
> Alexander Stein (2):
> thermal/of: Add OF node address to output message
> thermal/drivers/imx91: Add hwmon support
>
> Gopi Krishna Menon (2):
> thermal/drivers/spear: Fix error condition for reading
> st,thermal-flags
> dt-bindings: thermal: st,thermal-spear1340: convert to dtschema
>
> John Madieu (1):
> thermal: renesas: rzg3e: Remove stale @trim_offset kernel-doc entry
>
> Krzysztof Kozlowski (1):
> dt-bindings: thermal: qcom-tsens: Add Eliza SoC TSENS
>
> Manaf Meethalavalappu Pallikunhi (1):
> dt-bindings: thermal: qcom-tsens: Document the SM8750 Temperature
> Sensor
>
> Richard Acayan (2):
> dt-bindings: thermal: tsens: add SDM670 compatible
> dt-bindings: thermal: lmh: Add SDM670 compatible
>
> Thorsten Blum (4):
> thermal/drivers/sprd: Fix temperature clamping in
> sprd_thm_temp_to_rawdata
> thermal/drivers/sprd: Fix raw temperature clamping in
> sprd_thm_rawdata_to_temp
> thermal/drivers/sprd: Use min instead of clamp in
> sprd_thm_temp_to_rawdata
> thermal/drivers/brcmstb_thermal: Use max to simplify brcmstb_get_temp
>
> .../devicetree/bindings/thermal/qcom-lmh.yaml | 3 ++
> .../devicetree/bindings/thermal/qcom-tsens.yaml | 3 ++
> .../devicetree/bindings/thermal/spear-thermal.txt | 14 ---------
> .../bindings/thermal/st,thermal-spear1340.yaml | 36
> ++++++++++++++++++++++
> drivers/thermal/broadcom/brcmstb_thermal.c | 8 ++---
> drivers/thermal/imx91_thermal.c | 4 +++
> drivers/thermal/renesas/rzg3e_thermal.c | 1 -
> drivers/thermal/spear_thermal.c | 2 +-
> drivers/thermal/sprd_thermal.c | 6 ++--
> drivers/thermal/thermal_of.c | 20 ++++++------
> 10 files changed, 63 insertions(+), 34 deletions(-)
> delete mode 100644
> Documentation/devicetree/bindings/thermal/spear-thermal.txt
> create mode 100644
> Documentation/devicetree/bindings/thermal/st,thermal-spear1340.yaml
Pulled and added to linux-pm.git/linux-next, thanks!
^ permalink raw reply
* [PATCH v2 9/9] pmdomain: renesas: rmobile-sysc: Drop GENPD_FLAG_NO_STAY_ON
From: Ulf Hansson @ 2026-04-10 10:40 UTC (permalink / raw)
To: Saravana Kannan, Rafael J . Wysocki, Greg Kroah-Hartman, linux-pm
Cc: Sudeep Holla, Cristian Marussi, Kevin Hilman, Stephen Boyd,
Marek Szyprowski, Bjorn Andersson, Abel Vesa, Peng Fan,
Tomi Valkeinen, Maulik Shah, Konrad Dybcio, Thierry Reding,
Jonathan Hunter, Geert Uytterhoeven, Dmitry Baryshkov,
Ulf Hansson, linux-arm-kernel, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20260410104058.83748-1-ulf.hansson@linaro.org>
Rmobile-sysc is not a onecell provider and didn't really need
the GENPD_FLAG_NO_STAY_ON flag in the first place. Let's drop it.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
drivers/pmdomain/renesas/rmobile-sysc.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/pmdomain/renesas/rmobile-sysc.c b/drivers/pmdomain/renesas/rmobile-sysc.c
index 93103ff33d6e..e36f5d763c91 100644
--- a/drivers/pmdomain/renesas/rmobile-sysc.c
+++ b/drivers/pmdomain/renesas/rmobile-sysc.c
@@ -100,8 +100,7 @@ static void rmobile_init_pm_domain(struct rmobile_pm_domain *rmobile_pd)
struct generic_pm_domain *genpd = &rmobile_pd->genpd;
struct dev_power_governor *gov = rmobile_pd->gov;
- genpd->flags |= GENPD_FLAG_PM_CLK | GENPD_FLAG_ACTIVE_WAKEUP |
- GENPD_FLAG_NO_STAY_ON;
+ genpd->flags |= GENPD_FLAG_PM_CLK | GENPD_FLAG_ACTIVE_WAKEUP;
genpd->attach_dev = cpg_mstp_attach_dev;
genpd->detach_dev = cpg_mstp_detach_dev;
--
2.43.0
^ permalink raw reply related
* [PATCH v2 8/9] pmdomain: renesas: rcar-sysc: Drop GENPD_FLAG_NO_STAY_ON
From: Ulf Hansson @ 2026-04-10 10:40 UTC (permalink / raw)
To: Saravana Kannan, Rafael J . Wysocki, Greg Kroah-Hartman, linux-pm
Cc: Sudeep Holla, Cristian Marussi, Kevin Hilman, Stephen Boyd,
Marek Szyprowski, Bjorn Andersson, Abel Vesa, Peng Fan,
Tomi Valkeinen, Maulik Shah, Konrad Dybcio, Thierry Reding,
Jonathan Hunter, Geert Uytterhoeven, Dmitry Baryshkov,
Ulf Hansson, linux-arm-kernel, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20260410104058.83748-1-ulf.hansson@linaro.org>
Due to the new fine grained sync_state support for onecell genpd provider
drivers, we should no longer need to use the legacy behaviour. Therefore,
let's drop GENPD_FLAG_NO_STAY_ON.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
drivers/pmdomain/renesas/rcar-sysc.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/pmdomain/renesas/rcar-sysc.c b/drivers/pmdomain/renesas/rcar-sysc.c
index bd7bb9cbd9da..e4608c657629 100644
--- a/drivers/pmdomain/renesas/rcar-sysc.c
+++ b/drivers/pmdomain/renesas/rcar-sysc.c
@@ -241,7 +241,6 @@ static int __init rcar_sysc_pd_setup(struct rcar_sysc_pd *pd)
}
}
- genpd->flags |= GENPD_FLAG_NO_STAY_ON;
genpd->power_off = rcar_sysc_pd_power_off;
genpd->power_on = rcar_sysc_pd_power_on;
--
2.43.0
^ permalink raw reply related
* [PATCH v2 7/9] pmdomain: renesas: rcar-gen4-sysc: Drop GENPD_FLAG_NO_STAY_ON
From: Ulf Hansson @ 2026-04-10 10:40 UTC (permalink / raw)
To: Saravana Kannan, Rafael J . Wysocki, Greg Kroah-Hartman, linux-pm
Cc: Sudeep Holla, Cristian Marussi, Kevin Hilman, Stephen Boyd,
Marek Szyprowski, Bjorn Andersson, Abel Vesa, Peng Fan,
Tomi Valkeinen, Maulik Shah, Konrad Dybcio, Thierry Reding,
Jonathan Hunter, Geert Uytterhoeven, Dmitry Baryshkov,
Ulf Hansson, linux-arm-kernel, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20260410104058.83748-1-ulf.hansson@linaro.org>
Due to the new fine grained sync_state support for onecell genpd provider
drivers, we should no longer need to use the legacy behaviour. Therefore,
let's drop GENPD_FLAG_NO_STAY_ON.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
drivers/pmdomain/renesas/rcar-gen4-sysc.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/pmdomain/renesas/rcar-gen4-sysc.c b/drivers/pmdomain/renesas/rcar-gen4-sysc.c
index 0c6c639a91d0..81b154da725f 100644
--- a/drivers/pmdomain/renesas/rcar-gen4-sysc.c
+++ b/drivers/pmdomain/renesas/rcar-gen4-sysc.c
@@ -251,7 +251,6 @@ static int __init rcar_gen4_sysc_pd_setup(struct rcar_gen4_sysc_pd *pd)
genpd->detach_dev = cpg_mssr_detach_dev;
}
- genpd->flags |= GENPD_FLAG_NO_STAY_ON;
genpd->power_off = rcar_gen4_sysc_pd_power_off;
genpd->power_on = rcar_gen4_sysc_pd_power_on;
--
2.43.0
^ permalink raw reply related
* [PATCH v2 6/9] pmdomain: core: Export a common function for ->queue_sync_state()
From: Ulf Hansson @ 2026-04-10 10:40 UTC (permalink / raw)
To: Saravana Kannan, Rafael J . Wysocki, Greg Kroah-Hartman, linux-pm
Cc: Sudeep Holla, Cristian Marussi, Kevin Hilman, Stephen Boyd,
Marek Szyprowski, Bjorn Andersson, Abel Vesa, Peng Fan,
Tomi Valkeinen, Maulik Shah, Konrad Dybcio, Thierry Reding,
Jonathan Hunter, Geert Uytterhoeven, Dmitry Baryshkov,
Ulf Hansson, linux-arm-kernel, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20260410104058.83748-1-ulf.hansson@linaro.org>
Along with of_genpd_sync_state() that genpd provider drivers may use to
manage sync_state, let's add and export of_genpd_queue_sync_state() for
those that may need it. It's expected that the genpd provider driver
assigns its own ->queue_sync_state() callback and invokes the new helper
from there.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
drivers/pmdomain/core.c | 14 +++++++++-----
include/linux/pm_domain.h | 2 ++
2 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
index f11dc2110737..49e60cb67b3e 100644
--- a/drivers/pmdomain/core.c
+++ b/drivers/pmdomain/core.c
@@ -2764,7 +2764,7 @@ static void genpd_parse_for_consumer(struct device_node *sup,
}
}
-static void _genpd_queue_sync_state(struct device_node *np)
+static void genpd_queue_sync_state(struct device_node *np)
{
struct generic_pm_domain *genpd;
@@ -2782,11 +2782,14 @@ static void _genpd_queue_sync_state(struct device_node *np)
mutex_unlock(&gpd_list_lock);
}
-static void genpd_queue_sync_state(struct device *dev)
+void of_genpd_queue_sync_state(struct device *dev)
{
struct device_node *np = dev->of_node;
struct device_link *link;
+ if (!np)
+ return;
+
if (!genpd_should_wait_for_consumer(np))
return;
@@ -2810,8 +2813,9 @@ static void genpd_queue_sync_state(struct device *dev)
genpd_parse_for_consumer(np, consumer->of_node);
}
- _genpd_queue_sync_state(np);
+ genpd_queue_sync_state(np);
}
+EXPORT_SYMBOL_GPL(of_genpd_queue_sync_state);
static void genpd_sync_state(struct device *dev)
{
@@ -2922,7 +2926,7 @@ int of_genpd_add_provider_onecell(struct device_node *np,
sync_state = true;
} else if (!dev_has_sync_state(dev)) {
dev_set_drv_sync_state(dev, genpd_sync_state);
- dev_set_drv_queue_sync_state(dev, genpd_queue_sync_state);
+ dev_set_drv_queue_sync_state(dev, of_genpd_queue_sync_state);
}
put_device(dev);
@@ -3654,7 +3658,7 @@ static void genpd_provider_queue_sync_state(struct device *dev)
if (genpd->sync_state != GENPD_SYNC_STATE_ONECELL)
return;
- genpd_queue_sync_state(dev);
+ of_genpd_queue_sync_state(dev);
}
static void genpd_provider_sync_state(struct device *dev)
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 7aa49721cde5..d428dd805c46 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -467,6 +467,7 @@ int of_genpd_remove_subdomain(const struct of_phandle_args *parent_spec,
struct generic_pm_domain *of_genpd_remove_last(struct device_node *np);
int of_genpd_parse_idle_states(struct device_node *dn,
struct genpd_power_state **states, int *n);
+void of_genpd_queue_sync_state(struct device *dev);
void of_genpd_sync_state(struct device_node *np);
int genpd_dev_pm_attach(struct device *dev);
@@ -513,6 +514,7 @@ static inline int of_genpd_parse_idle_states(struct device_node *dn,
return -ENODEV;
}
+static inline void of_genpd_queue_sync_state(struct device *dev) {}
static inline void of_genpd_sync_state(struct device_node *np) {}
static inline int genpd_dev_pm_attach(struct device *dev)
--
2.43.0
^ permalink raw reply related
* [PATCH v2 5/9] pmdomain: core: Extend fine grained sync_state to more onecell providers
From: Ulf Hansson @ 2026-04-10 10:40 UTC (permalink / raw)
To: Saravana Kannan, Rafael J . Wysocki, Greg Kroah-Hartman, linux-pm
Cc: Sudeep Holla, Cristian Marussi, Kevin Hilman, Stephen Boyd,
Marek Szyprowski, Bjorn Andersson, Abel Vesa, Peng Fan,
Tomi Valkeinen, Maulik Shah, Konrad Dybcio, Thierry Reding,
Jonathan Hunter, Geert Uytterhoeven, Dmitry Baryshkov,
Ulf Hansson, linux-arm-kernel, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20260410104058.83748-1-ulf.hansson@linaro.org>
A onecell power domain provider driver that we can assign a common
->sync_state() callback for, should be able to benefit from the improved
fine grained sync_state support in genpd. Therefore, let's also assign the
->queue_sync_state() callback for these types of provider drivers.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
drivers/pmdomain/core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
index 783d6f981708..f11dc2110737 100644
--- a/drivers/pmdomain/core.c
+++ b/drivers/pmdomain/core.c
@@ -2918,10 +2918,12 @@ int of_genpd_add_provider_onecell(struct device_node *np,
fwnode = of_fwnode_handle(np);
dev = get_dev_from_fwnode(fwnode);
- if (!dev)
+ if (!dev) {
sync_state = true;
- else
+ } else if (!dev_has_sync_state(dev)) {
dev_set_drv_sync_state(dev, genpd_sync_state);
+ dev_set_drv_queue_sync_state(dev, genpd_queue_sync_state);
+ }
put_device(dev);
--
2.43.0
^ permalink raw reply related
* [PATCH v2 4/9] pmdomain: core: Add initial fine grained sync_state support
From: Ulf Hansson @ 2026-04-10 10:40 UTC (permalink / raw)
To: Saravana Kannan, Rafael J . Wysocki, Greg Kroah-Hartman, linux-pm
Cc: Sudeep Holla, Cristian Marussi, Kevin Hilman, Stephen Boyd,
Marek Szyprowski, Bjorn Andersson, Abel Vesa, Peng Fan,
Tomi Valkeinen, Maulik Shah, Konrad Dybcio, Thierry Reding,
Jonathan Hunter, Geert Uytterhoeven, Dmitry Baryshkov,
Ulf Hansson, linux-arm-kernel, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20260410104058.83748-1-ulf.hansson@linaro.org>
A onecell (#power-domain-cells = <1 or 2>; in DT) power domain provider
typically provides multiple independent power domains, each with their own
corresponding consumers. In these cases we have to wait for all consumers
for all the provided power domains before the ->sync_state() callback gets
called for the supplier.
As a first step to improve this, let's implement fine grained
sync_state support on a per genpd basis by using the ->queue_sync_state()
callback. To take it step by step, let's initially limit the improvement to
the internal genpd provider driver and to its corresponding genpd devices
for onecell providers.
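
The per-genpd decision can be illustrated with a simplified, userspace-only model. struct pd_model and release_idle_domains() are hypothetical names invented for this sketch; the real code in the diff below operates on struct generic_pm_domain under the genpd locks and queues power-off work rather than just clearing a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one power domain; not the kernel struct. */
struct pd_model {
	bool stay_on;           /* kept powered on during boot */
	bool wait_for_consumer; /* at least one consumer has not been probed */
};

/*
 * Release every domain that is still held on but has no unprobed consumer.
 * Domains with a pending consumer keep their stay_on state.
 * Returns the number of domains released.
 */
static size_t release_idle_domains(struct pd_model *pds, size_t n)
{
	size_t released = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (pds[i].stay_on && !pds[i].wait_for_consumer) {
			/* The kernel code would also queue power-off work here. */
			pds[i].stay_on = false;
			released++;
		}
	}

	return released;
}
```

With this rule, one slow-to-probe consumer only holds back the domains it actually references, instead of every domain of the provider.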
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
drivers/pmdomain/core.c | 125 ++++++++++++++++++++++++++++++++++++++
include/linux/pm_domain.h | 1 +
2 files changed, 126 insertions(+)
diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
index ad57846f02a3..783d6f981708 100644
--- a/drivers/pmdomain/core.c
+++ b/drivers/pmdomain/core.c
@@ -2699,6 +2699,120 @@ static struct generic_pm_domain *genpd_get_from_provider(
return genpd;
}
+static bool genpd_should_wait_for_consumer(struct device_node *np)
+{
+ struct generic_pm_domain *genpd;
+ bool should_wait = false;
+
+ mutex_lock(&gpd_list_lock);
+ list_for_each_entry(genpd, &gpd_list, gpd_list_node) {
+ if (genpd->provider == of_fwnode_handle(np)) {
+ genpd_lock(genpd);
+
+ /* Clear the previous state before reevaluating. */
+ genpd->wait_for_consumer = false;
+
+ /*
+ * Unless there is at least one genpd for the provider
+ * that is being kept powered-on, we don't have to care
+ * about waiting for consumers.
+ */
+ if (genpd->stay_on)
+ should_wait = true;
+
+ genpd_unlock(genpd);
+ }
+ }
+ mutex_unlock(&gpd_list_lock);
+
+ return should_wait;
+}
+
+static void genpd_parse_for_consumer(struct device_node *sup,
+ struct device_node *con)
+{
+ struct generic_pm_domain *genpd;
+ int i;
+
+ for (i = 0; ; i++) {
+ struct of_phandle_args pd_args;
+
+ if (of_parse_phandle_with_args(con, "power-domains",
+ "#power-domain-cells",
+ i, &pd_args))
+ break;
+
+ /*
+ * The phandle must correspond to the supplier's genpd provider
+ * to be relevant else let's move to the next index.
+ */
+ if (sup != pd_args.np) {
+ of_node_put(pd_args.np);
+ continue;
+ }
+
+ mutex_lock(&gpd_list_lock);
+ genpd = genpd_get_from_provider(&pd_args);
+ if (!IS_ERR(genpd)) {
+ genpd_lock(genpd);
+ genpd->wait_for_consumer = true;
+ genpd_unlock(genpd);
+ }
+ mutex_unlock(&gpd_list_lock);
+
+ of_node_put(pd_args.np);
+ }
+}
+
+static void _genpd_queue_sync_state(struct device_node *np)
+{
+ struct generic_pm_domain *genpd;
+
+ mutex_lock(&gpd_list_lock);
+ list_for_each_entry(genpd, &gpd_list, gpd_list_node) {
+ if (genpd->provider == of_fwnode_handle(np)) {
+ genpd_lock(genpd);
+ if (genpd->stay_on && !genpd->wait_for_consumer) {
+ genpd->stay_on = false;
+ genpd_queue_power_off_work(genpd);
+ }
+ genpd_unlock(genpd);
+ }
+ }
+ mutex_unlock(&gpd_list_lock);
+}
+
+static void genpd_queue_sync_state(struct device *dev)
+{
+ struct device_node *np = dev->of_node;
+ struct device_link *link;
+
+ if (!genpd_should_wait_for_consumer(np))
+ return;
+
+ list_for_each_entry(link, &dev->links.consumers, s_node) {
+ struct device *consumer = link->consumer;
+
+ if (!device_link_test(link, DL_FLAG_MANAGED))
+ continue;
+
+ if (link->status == DL_STATE_ACTIVE)
+ continue;
+
+ if (!consumer->of_node)
+ continue;
+
+ /*
+ * A consumer device has not been probed yet. Let's parse its
+ * device node for the power-domains property, to find out the
+ * genpds it may belong to and then prevent sync state for them.
+ */
+ genpd_parse_for_consumer(np, consumer->of_node);
+ }
+
+ _genpd_queue_sync_state(np);
+}
+
static void genpd_sync_state(struct device *dev)
{
return of_genpd_sync_state(dev->of_node);
@@ -3531,6 +3645,16 @@ static int genpd_provider_probe(struct device *dev)
return 0;
}
+static void genpd_provider_queue_sync_state(struct device *dev)
+{
+ struct generic_pm_domain *genpd = container_of(dev, struct generic_pm_domain, dev);
+
+ if (genpd->sync_state != GENPD_SYNC_STATE_ONECELL)
+ return;
+
+ genpd_queue_sync_state(dev);
+}
+
static void genpd_provider_sync_state(struct device *dev)
{
struct generic_pm_domain *genpd = container_of(dev, struct generic_pm_domain, dev);
@@ -3559,6 +3683,7 @@ static struct device_driver genpd_provider_drv = {
.name = "genpd_provider",
.bus = &genpd_provider_bus_type,
.probe = genpd_provider_probe,
+ .queue_sync_state = genpd_provider_queue_sync_state,
.sync_state = genpd_provider_sync_state,
.suppress_bind_attrs = true,
};
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index b299dc0128d6..7aa49721cde5 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -215,6 +215,7 @@ struct generic_pm_domain {
cpumask_var_t cpus; /* A cpumask of the attached CPUs */
bool synced_poweroff; /* A consumer needs a synced poweroff */
bool stay_on; /* Stay powered-on during boot. */
+ bool wait_for_consumer; /* Consumers awaits to be probed. */
enum genpd_sync_state sync_state; /* How sync_state is managed. */
int (*power_off)(struct generic_pm_domain *domain);
int (*power_on)(struct generic_pm_domain *domain);
--
2.43.0
^ permalink raw reply related
* [PATCH v2 3/9] pmdomain: core: Move genpd_get_from_provider()
From: Ulf Hansson @ 2026-04-10 10:40 UTC (permalink / raw)
To: Saravana Kannan, Rafael J . Wysocki, Greg Kroah-Hartman, linux-pm
Cc: Sudeep Holla, Cristian Marussi, Kevin Hilman, Stephen Boyd,
Marek Szyprowski, Bjorn Andersson, Abel Vesa, Peng Fan,
Tomi Valkeinen, Maulik Shah, Konrad Dybcio, Thierry Reding,
Jonathan Hunter, Geert Uytterhoeven, Dmitry Baryshkov,
Ulf Hansson, linux-arm-kernel, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20260410104058.83748-1-ulf.hansson@linaro.org>
To prepare for subsequent changes and to avoid an unnecessary function
declaration, let's move genpd_get_from_provider() a bit earlier in the
code.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
drivers/pmdomain/core.c | 70 ++++++++++++++++++++---------------------
1 file changed, 35 insertions(+), 35 deletions(-)
diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
index 4d32fc676aaf..ad57846f02a3 100644
--- a/drivers/pmdomain/core.c
+++ b/drivers/pmdomain/core.c
@@ -2664,6 +2664,41 @@ static bool genpd_present(const struct generic_pm_domain *genpd)
return ret;
}
+/**
+ * genpd_get_from_provider() - Look-up PM domain
+ * @genpdspec: OF phandle args to use for look-up
+ *
+ * Looks for a PM domain provider under the node specified by @genpdspec and if
+ * found, uses xlate function of the provider to map phandle args to a PM
+ * domain.
+ *
+ * Returns a valid pointer to struct generic_pm_domain on success or ERR_PTR()
+ * on failure.
+ */
+static struct generic_pm_domain *genpd_get_from_provider(
+ const struct of_phandle_args *genpdspec)
+{
+ struct generic_pm_domain *genpd = ERR_PTR(-ENOENT);
+ struct of_genpd_provider *provider;
+
+ if (!genpdspec)
+ return ERR_PTR(-EINVAL);
+
+ mutex_lock(&of_genpd_mutex);
+
+ /* Check if we have such a provider in our array */
+ list_for_each_entry(provider, &of_genpd_providers, link) {
+ if (provider->node == genpdspec->np)
+ genpd = provider->xlate(genpdspec, provider->data);
+ if (!IS_ERR(genpd))
+ break;
+ }
+
+ mutex_unlock(&of_genpd_mutex);
+
+ return genpd;
+}
+
static void genpd_sync_state(struct device *dev)
{
return of_genpd_sync_state(dev->of_node);
@@ -2889,41 +2924,6 @@ void of_genpd_del_provider(struct device_node *np)
}
EXPORT_SYMBOL_GPL(of_genpd_del_provider);
-/**
- * genpd_get_from_provider() - Look-up PM domain
- * @genpdspec: OF phandle args to use for look-up
- *
- * Looks for a PM domain provider under the node specified by @genpdspec and if
- * found, uses xlate function of the provider to map phandle args to a PM
- * domain.
- *
- * Returns a valid pointer to struct generic_pm_domain on success or ERR_PTR()
- * on failure.
- */
-static struct generic_pm_domain *genpd_get_from_provider(
- const struct of_phandle_args *genpdspec)
-{
- struct generic_pm_domain *genpd = ERR_PTR(-ENOENT);
- struct of_genpd_provider *provider;
-
- if (!genpdspec)
- return ERR_PTR(-EINVAL);
-
- mutex_lock(&of_genpd_mutex);
-
- /* Check if we have such a provider in our array */
- list_for_each_entry(provider, &of_genpd_providers, link) {
- if (provider->node == genpdspec->np)
- genpd = provider->xlate(genpdspec, provider->data);
- if (!IS_ERR(genpd))
- break;
- }
-
- mutex_unlock(&of_genpd_mutex);
-
- return genpd;
-}
-
/**
* of_genpd_add_device() - Add a device to an I/O PM domain
* @genpdspec: OF phandle args to use for look-up PM domain
--
2.43.0
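For readers following the look-up logic that this patch moves, here is a minimal user-space sketch of the same pattern: walk a provider list, let the matching provider's xlate callback translate the spec, and report failure through an ERR_PTR-style pointer. The types `struct domain`, `struct lookup_spec` and `struct provider`, and the functions `lookup_domain()`, `xlate_simple()` and `demo_lookup()`, are hypothetical stand-ins and not kernel API. The ERR_PTR helpers are simplified re-implementations, and the of_genpd_mutex locking is left out on purpose.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-ins for generic_pm_domain, of_phandle_args and
 * of_genpd_provider; only the fields the look-up needs. */
struct domain { const char *name; };
struct lookup_spec { const void *np; };
struct provider {
	const void *node;
	struct domain *(*xlate)(const struct lookup_spec *spec, void *data);
	void *data;
	struct provider *next;
};

/* Same control flow as genpd_get_from_provider(), minus the mutex. */
static struct domain *lookup_domain(const struct provider *providers,
				    const struct lookup_spec *spec)
{
	struct domain *d = ERR_PTR(-ENOENT);
	const struct provider *p;

	if (!spec)
		return ERR_PTR(-EINVAL);

	for (p = providers; p; p = p->next) {
		if (p->node == spec->np)
			d = p->xlate(spec, p->data);
		if (!IS_ERR(d))
			break;
	}

	return d;
}

/* One domain per provider: the provider data is the domain itself. */
static struct domain *xlate_simple(const struct lookup_spec *spec, void *data)
{
	(void)spec;
	return data;
}

/* Returns 0 when all three look-up cases behave as expected. */
static int demo_lookup(void)
{
	static int node_a, node_b, node_missing;
	static struct domain dom_a = { "a" }, dom_b = { "b" };
	struct provider pb = { &node_b, xlate_simple, &dom_b, NULL };
	struct provider pa = { &node_a, xlate_simple, &dom_a, &pb };
	struct lookup_spec spec_b = { &node_b };
	struct lookup_spec spec_missing = { &node_missing };
	struct domain *d;

	if (lookup_domain(&pa, &spec_b) != &dom_b)
		return 1;	/* second provider should match */

	d = lookup_domain(&pa, &spec_missing);
	if (!IS_ERR(d) || PTR_ERR(d) != -ENOENT)
		return 2;	/* no provider registered for this node */

	d = lookup_domain(&pa, NULL);
	if (!IS_ERR(d) || PTR_ERR(d) != -EINVAL)
		return 3;	/* missing spec is rejected up front */

	return 0;
}
```

Callers are expected to test the result with IS_ERR() rather than against NULL, since a failed look-up returns an encoded errno and never a NULL pointer.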
* [PATCH v2 2/9] driver core: Add dev_set_drv_queue_sync_state()
From: Ulf Hansson @ 2026-04-10 10:40 UTC (permalink / raw)
To: Saravana Kannan, Rafael J . Wysocki, Greg Kroah-Hartman, linux-pm
Cc: Sudeep Holla, Cristian Marussi, Kevin Hilman, Stephen Boyd,
Marek Szyprowski, Bjorn Andersson, Abel Vesa, Peng Fan,
Tomi Valkeinen, Maulik Shah, Konrad Dybcio, Thierry Reding,
Jonathan Hunter, Geert Uytterhoeven, Dmitry Baryshkov,
Ulf Hansson, linux-arm-kernel, linux-kernel, Geert Uytterhoeven
In-Reply-To: <20260410104058.83748-1-ulf.hansson@linaro.org>
Similar to the dev_set_drv_sync_state() helper, let's add another one to
allow subsystem level code to set the ->queue_sync_state() callback for a
driver that has not already set it.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
include/linux/device.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/include/linux/device.h b/include/linux/device.h
index e65d564f01cd..f812e70bdf22 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -994,6 +994,18 @@ static inline int dev_set_drv_sync_state(struct device *dev,
return 0;
}
+static inline int dev_set_drv_queue_sync_state(struct device *dev,
+ void (*fn)(struct device *dev))
+{
+ if (!dev || !dev->driver)
+ return 0;
+ if (dev->driver->queue_sync_state && dev->driver->queue_sync_state != fn)
+ return -EBUSY;
+ if (!dev->driver->queue_sync_state)
+ dev->driver->queue_sync_state = fn;
+ return 0;
+}
+
static inline void dev_set_removable(struct device *dev,
enum device_removable removable)
{
--
2.43.0
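To illustrate the return values of the new helper, here is a small user-space sketch. The trimmed-down `struct device` and `struct device_driver`, the callbacks `subsys_queue_sync_state()` and `other_queue_sync_state()`, and `demo_queue_sync_state()` are hypothetical stand-ins for illustration only. The helper body is copied from the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct device;

struct device_driver {
	void (*queue_sync_state)(struct device *dev);
};

struct device {
	struct device_driver *driver;
};

/* Same logic as the helper added by the patch. */
static inline int dev_set_drv_queue_sync_state(struct device *dev,
					       void (*fn)(struct device *dev))
{
	if (!dev || !dev->driver)
		return 0;
	if (dev->driver->queue_sync_state && dev->driver->queue_sync_state != fn)
		return -EBUSY;
	if (!dev->driver->queue_sync_state)
		dev->driver->queue_sync_state = fn;
	return 0;
}

/* Hypothetical callbacks a subsystem might want to install. */
static void subsys_queue_sync_state(struct device *dev) { (void)dev; }
static void other_queue_sync_state(struct device *dev) { (void)dev; }

/* Returns 0 when every case behaves as the helper's logic implies. */
static int demo_queue_sync_state(void)
{
	struct device_driver drv = { NULL };
	struct device dev = { &drv };
	struct device unbound = { NULL };

	/* No device or no bound driver: nothing to do, not an error. */
	if (dev_set_drv_queue_sync_state(NULL, subsys_queue_sync_state) != 0)
		return 1;
	if (dev_set_drv_queue_sync_state(&unbound, subsys_queue_sync_state) != 0)
		return 2;

	/* Callback unset: the helper installs it. */
	if (dev_set_drv_queue_sync_state(&dev, subsys_queue_sync_state) != 0 ||
	    drv.queue_sync_state != subsys_queue_sync_state)
		return 3;

	/* Setting the same callback again is accepted. */
	if (dev_set_drv_queue_sync_state(&dev, subsys_queue_sync_state) != 0)
		return 4;

	/* A different callback is refused and the first one is kept. */
	if (dev_set_drv_queue_sync_state(&dev, other_queue_sync_state) != -EBUSY)
		return 5;
	if (drv.queue_sync_state != subsys_queue_sync_state)
		return 6;

	return 0;
}
```

Note that the callback is stored in the driver, not in the device, so it applies to every device bound to that driver.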