Linux Power Management development
 help / color / mirror / Atom feed
* [PATCH 0/2] powercap: dtpm: Guard against missing energy model in dtpm callbacks
@ 2026-06-11 20:46 Elazar Leibovich
  2026-06-11 20:46 ` [PATCH 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks Elazar Leibovich
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Elazar Leibovich @ 2026-06-11 20:46 UTC (permalink / raw)
  To: linux-pm; +Cc: Rafael J . Wysocki, Daniel Lezcano, linux-kernel,
	Elazar Leibovich

The dtpm_cpu and dtpm_devfreq powercap drivers dereference the perf
domain returned by em_cpu_get()/em_pd_get() without checking it for
NULL. When the energy model is absent - for instance when a CPU becomes
impossible at runtime so get_cpu_device() returns NULL, or when the
device EM was never registered or has already been unregistered - the
dtpm callbacks, several of which are reachable directly from sysfs via
the powercap zone attributes, crash on a NULL pointer dereference.

Add the missing NULL checks, mirroring the guard that already exists in
get_pd_power_uw() since commit 46dc57406887 ("powercap: dtpm_cpu: Fix
NULL pointer dereference in get_pd_power_uw()").

Note these patches only handle a perf domain that is already gone when
the callback starts. A callback racing with
em_dev_unregister_perf_domain() can still observe a non-NULL perf
domain that is freed underneath it, because only the EM perf state
table is RCU-protected, not struct em_perf_domain itself. Closing that
race needs RCU lifetime guarantees from the EM core plus holding
rcu_read_lock() across the perf domain access in the readers; this will
be addressed in future patches.

Sivan Zohar-Kotzer (2):
  powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs
    callbacks
  powercap: dtpm_devfreq: Guard em_pd_get() against NULL return in
    callbacks

 drivers/powercap/dtpm_cpu.c     | 6 ++++++
 drivers/powercap/dtpm_devfreq.c | 9 +++++++++
 2 files changed, 15 insertions(+)

-- 
2.50.1 (Apple Git-155)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks
  2026-06-11 20:46 [PATCH 0/2] powercap: dtpm: Guard against missing energy model in dtpm callbacks Elazar Leibovich
@ 2026-06-11 20:46 ` Elazar Leibovich
  2026-06-23 10:12   ` Daniel Lezcano
  2026-06-11 20:46 ` [PATCH 2/2] powercap: dtpm_devfreq: Guard em_pd_get() against NULL return in callbacks Elazar Leibovich
  2026-06-24 20:31 ` [PATCH v2 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks sivany32
  2 siblings, 1 reply; 7+ messages in thread
From: Elazar Leibovich @ 2026-06-11 20:46 UTC (permalink / raw)
  To: linux-pm
  Cc: Rafael J . Wysocki, Daniel Lezcano, linux-kernel,
	Sivan Zohar-Kotzer, Elazar Leibovich

From: Sivan Zohar-Kotzer <sivany32@gmail.com>

em_cpu_get() can return NULL when the CPU has no energy model registered
(e.g. during hotplug races or initialization failures). Two call sites
miss the NULL test:

1. set_pd_power_limit() — reachable from sysfs via the powercap
   constraint power_limit_uw store path. If NULL, return the current
   power limit unchanged (no-op) instead of crashing.

2. update_pd_power_uw() — called from CPU hotplug handlers via
   dtpm_update_power(). Return -EINVAL if the EM is absent, since this
   is an internal consistency failure that should not be swallowed.

The other sysfs-reachable path, get_pd_power_uw(), already has a NULL
guard returning 0.

Fixes: 0e8f68d7f048 ("powercap/drivers/dtpm: Add CPU energy model based support")
Signed-off-by: Sivan Zohar-Kotzer <sivany32@gmail.com>
Co-developed-by: Elazar Leibovich <elazarl@gmail.com>
Signed-off-by: Elazar Leibovich <elazarl@gmail.com>
---
 drivers/powercap/dtpm_cpu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 99390ec1481f..0a460f97bf15 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -47,6 +47,9 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
 	u64 power;
 	int i, nr_cpus;
 
+	if (!pd)
+		return dtpm->power_limit;
+
 	nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(pd->cpus));
 
 	rcu_read_lock();
@@ -125,6 +128,9 @@ static int update_pd_power_uw(struct dtpm *dtpm)
 	struct em_perf_state *table;
 	int nr_cpus;
 
+	if (!em)
+		return -EINVAL;
+
 	nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(em->cpus));
 
 	rcu_read_lock();
-- 
2.50.1 (Apple Git-155)


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] powercap: dtpm_devfreq: Guard em_pd_get() against NULL return in callbacks
  2026-06-11 20:46 [PATCH 0/2] powercap: dtpm: Guard against missing energy model in dtpm callbacks Elazar Leibovich
  2026-06-11 20:46 ` [PATCH 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks Elazar Leibovich
@ 2026-06-11 20:46 ` Elazar Leibovich
  2026-06-24 20:31 ` [PATCH v2 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks sivany32
  2 siblings, 0 replies; 7+ messages in thread
From: Elazar Leibovich @ 2026-06-11 20:46 UTC (permalink / raw)
  To: linux-pm
  Cc: Rafael J . Wysocki, Daniel Lezcano, linux-kernel,
	Sivan Zohar-Kotzer, Elazar Leibovich

From: Sivan Zohar-Kotzer <sivany32@gmail.com>

em_pd_get() returns NULL when the device has no energy model
registered. All three dtpm callbacks dereference the perf domain
without checking it:

1. set_pd_power_limit() — reachable from sysfs via the powercap
   constraint power_limit_uw store path. If NULL, return the current
   power limit unchanged (no-op) instead of crashing.

2. get_pd_power_uw() — reachable from sysfs via the power_uw read
   path. Return 0 if the EM is absent.

3. update_pd_power_uw() — called via dtpm_update_power(). Return
   -EINVAL if the EM is absent.

This mirrors the equivalent NULL checks in dtpm_cpu.c.

Note this only handles the case where the EM is already gone (or was
never registered) when the callback starts. A callback racing with
em_dev_unregister_perf_domain() can still observe a non-NULL perf
domain that is freed underneath it; closing that race requires
lifetime guarantees from the EM core and is addressed separately.

Fixes: e44655617317 ("powercap/drivers/dtpm: Add dtpm devfreq with energy model support")
Signed-off-by: Sivan Zohar-Kotzer <sivany32@gmail.com>
Co-developed-by: Elazar Leibovich <elazarl@gmail.com>
Signed-off-by: Elazar Leibovich <elazarl@gmail.com>
---
 drivers/powercap/dtpm_devfreq.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/powercap/dtpm_devfreq.c b/drivers/powercap/dtpm_devfreq.c
index d1dff6ccab12..cf16e2756481 100644
--- a/drivers/powercap/dtpm_devfreq.c
+++ b/drivers/powercap/dtpm_devfreq.c
@@ -39,6 +39,9 @@ static int update_pd_power_uw(struct dtpm *dtpm)
 	struct em_perf_domain *pd = em_pd_get(dev);
 	struct em_perf_state *table;
 
+	if (!pd)
+		return -EINVAL;
+
 	rcu_read_lock();
 	table = em_perf_state_from_pd(pd);
 
@@ -60,6 +63,9 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
 	unsigned long freq;
 	int i;
 
+	if (!pd)
+		return dtpm->power_limit;
+
 	rcu_read_lock();
 	table = em_perf_state_from_pd(pd);
 	for (i = 0; i < pd->nr_perf_states; i++) {
@@ -102,6 +108,9 @@ static u64 get_pd_power_uw(struct dtpm *dtpm)
 	u64 power = 0;
 	int i;
 
+	if (!pd)
+		return 0;
+
 	mutex_lock(&devfreq->lock);
 	status = devfreq->last_status;
 	mutex_unlock(&devfreq->lock);
-- 
2.50.1 (Apple Git-155)


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks
  2026-06-11 20:46 ` [PATCH 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks Elazar Leibovich
@ 2026-06-23 10:12   ` Daniel Lezcano
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Lezcano @ 2026-06-23 10:12 UTC (permalink / raw)
  To: Elazar Leibovich, linux-pm
  Cc: Rafael J . Wysocki, Daniel Lezcano, linux-kernel,
	Sivan Zohar-Kotzer


Hi,

On 6/11/26 22:46, Elazar Leibovich wrote:
> From: Sivan Zohar-Kotzer <sivany32@gmail.com>
> 
> em_cpu_get() can return NULL when the CPU has no energy model registered
> (e.g. during hotplug races or initialization failures). Two call sites
> miss the NULL test:
> 
> 1. set_pd_power_limit() — reachable from sysfs via the powercap
>     constraint power_limit_uw store path. If NULL, return the current
>     power limit unchanged (no-op) instead of crashing.

If there is no energy model, then __dtpm_cpu_setup() fails and dtpm_ops 
is not set. Consequently, set_pd_power_limit() can not be called.


> 2. update_pd_power_uw() — called from CPU hotplug handlers via
>     dtpm_update_power(). Return -EINVAL if the EM is absent, since this
>     is an internal consistency failure that should not be swallowed.

If there is no energy model, then __dtpm_cpu_setup() fails and 
dtpm_per_cpu is not set and dtpm_update_power() is not called.

online/offline:

         if (dtpm_cpu)
                 dtpm_update_power(&dtpm_cpu->dtpm);


> The other sysfs-reachable path, get_pd_power_uw(), already has a NULL
> guard returning 0.
> 
> Fixes: 0e8f68d7f048 ("powercap/drivers/dtpm: Add CPU energy model based support")
> Signed-off-by: Sivan Zohar-Kotzer <sivany32@gmail.com>
> Co-developed-by: Elazar Leibovich <elazarl@gmail.com>
> Signed-off-by: Elazar Leibovich <elazarl@gmail.com>
> ---
>   drivers/powercap/dtpm_cpu.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
> index 99390ec1481f..0a460f97bf15 100644
> --- a/drivers/powercap/dtpm_cpu.c
> +++ b/drivers/powercap/dtpm_cpu.c
> @@ -47,6 +47,9 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
>   	u64 power;
>   	int i, nr_cpus;
>   
> +	if (!pd)
> +		return dtpm->power_limit;
> +
>   	nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(pd->cpus));
>   
>   	rcu_read_lock();
> @@ -125,6 +128,9 @@ static int update_pd_power_uw(struct dtpm *dtpm)
>   	struct em_perf_state *table;
>   	int nr_cpus;
>   
> +	if (!em)
> +		return -EINVAL;
> +
>   	nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(em->cpus));
>   
>   	rcu_read_lock();



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks
  2026-06-11 20:46 [PATCH 0/2] powercap: dtpm: Guard against missing energy model in dtpm callbacks Elazar Leibovich
  2026-06-11 20:46 ` [PATCH 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks Elazar Leibovich
  2026-06-11 20:46 ` [PATCH 2/2] powercap: dtpm_devfreq: Guard em_pd_get() against NULL return in callbacks Elazar Leibovich
@ 2026-06-24 20:31 ` sivany32
  2026-06-24 20:44   ` Daniel Lezcano
  2 siblings, 1 reply; 7+ messages in thread
From: sivany32 @ 2026-06-24 20:31 UTC (permalink / raw)
  To: linux-pm
  Cc: rafael, daniel.lezcano, linux-kernel, elazarl, Sivan Zohar-Kotzer

From: Sivan Zohar-Kotzer <sivany32@gmail.com>

em_cpu_get() can return NULL when a CPU becomes impossible. Two call
sites miss the NULL test:

1. set_pd_power_limit() — reachable from sysfs via the powercap
   constraint power_limit_uw store path. If the CPU has become
   impossible then pd will be NULL, return the current power limit
   unchanged (no-op) instead of crashing.

2. update_pd_power_uw() — called from CPU hotplug handlers via
   dtpm_update_power(). While reaching this with a NULL em may be
   technically impossible today, the check is kept for defense in
   depth against future refactoring or subtle races.

The other sysfs-reachable path, get_pd_power_uw(), already has a NULL
guard returning 0.

Fixes: 0e8f68d7f048 ("powercap/drivers/dtpm: Add CPU energy model based support")
Signed-off-by: Sivan Zohar-Kotzer <sivany32@gmail.com>
Co-developed-by: Elazar Leibovich <elazarl@gmail.com>
Signed-off-by: Elazar Leibovich <elazarl@gmail.com>
---
 drivers/powercap/dtpm_cpu.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
index 21355db64..886cbe922 100644
--- a/drivers/powercap/dtpm_cpu.c
+++ b/drivers/powercap/dtpm_cpu.c
@@ -47,6 +47,9 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
 	u64 power;
 	int i, nr_cpus;
 
+	if (!pd)
+		return dtpm->power_limit;
+
 	nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(pd->cpus));
 
 	rcu_read_lock();
@@ -125,6 +128,9 @@ static int update_pd_power_uw(struct dtpm *dtpm)
 	struct em_perf_state *table;
 	int nr_cpus;
 
+	if (!em)
+		return -EINVAL;
+
 	nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(em->cpus));
 
 	rcu_read_lock();
-- 
2.39.5 (Apple Git-154)


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks
  2026-06-24 20:31 ` [PATCH v2 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks sivany32
@ 2026-06-24 20:44   ` Daniel Lezcano
  2026-06-24 21:03     ` Elazar Leibovich
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Lezcano @ 2026-06-24 20:44 UTC (permalink / raw)
  To: sivany32, linux-pm; +Cc: rafael, daniel.lezcano, linux-kernel, elazarl

On 6/24/26 22:31, sivany32@gmail.com wrote:
> From: Sivan Zohar-Kotzer <sivany32@gmail.com>
> 
> em_cpu_get() can return NULL when a CPU becomes impossible. Two call
> sites miss the NULL test:
> 
> 1. set_pd_power_limit() — reachable from sysfs via the powercap
>     constraint power_limit_uw store path. If the CPU has become
>     impossible then pd will be NULL, return the current power limit
>     unchanged (no-op) instead of crashing.

Can you provide a script or a recipe spotting these issues ?

> 2. update_pd_power_uw() — called from CPU hotplug handlers via
>     dtpm_update_power(). While reaching this with a NULL em may be
>     technically impossible today, the check is kept for defense in
>     depth against future refactoring or subtle races.
> 
> The other sysfs-reachable path, get_pd_power_uw(), already has a NULL
> guard returning 0.
> 
> Fixes: 0e8f68d7f048 ("powercap/drivers/dtpm: Add CPU energy model based support")
> Signed-off-by: Sivan Zohar-Kotzer <sivany32@gmail.com>
> Co-developed-by: Elazar Leibovich <elazarl@gmail.com>
> Signed-off-by: Elazar Leibovich <elazarl@gmail.com>
> ---
>   drivers/powercap/dtpm_cpu.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/powercap/dtpm_cpu.c b/drivers/powercap/dtpm_cpu.c
> index 21355db64..886cbe922 100644
> --- a/drivers/powercap/dtpm_cpu.c
> +++ b/drivers/powercap/dtpm_cpu.c
> @@ -47,6 +47,9 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
>   	u64 power;
>   	int i, nr_cpus;
>   
> +	if (!pd)
> +		return dtpm->power_limit;
> +
>   	nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(pd->cpus));
>   
>   	rcu_read_lock();
> @@ -125,6 +128,9 @@ static int update_pd_power_uw(struct dtpm *dtpm)
>   	struct em_perf_state *table;
>   	int nr_cpus;
>   
> +	if (!em)
> +		return -EINVAL;
> +
>   	nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(em->cpus));
>   
>   	rcu_read_lock();


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks
  2026-06-24 20:44   ` Daniel Lezcano
@ 2026-06-24 21:03     ` Elazar Leibovich
  0 siblings, 0 replies; 7+ messages in thread
From: Elazar Leibovich @ 2026-06-24 21:03 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: sivany32, linux-pm, rafael, daniel.lezcano, linux-kernel

On Wed, 24 Jun 2026 at 23:44, Daniel Lezcano
<daniel.lezcano@oss.qualcomm.com> wrote:
>
>
> Can you provide a script or a recipe spotting these issues ?
>
Thanks!

I'll describe the thought process.

In 46dc57406887 we found a single instance of `em_cpu_get()` without a
NULL check even though it can return a NULL. I think the catalyst was
seeing that this function could return NULL and not seeing NULL test
on a call site.

After some discussion we concluded that when a CPU becomes impossible
`em_cpu_get` can race and dereference a NULL pointer theoretically.

While we acknowledge this isn't a huge issue, a NULL test in a
non-critical path doesn't seem like a huge issue either.

We grep'ed the code for other usages of em_cpu_get, and traced the
calling process to see if they can race a CPU becoming impossible.

(While looking at the code, we noticed the rcu does not guard the PD
pointer, but that's for another WIP patch)

I'm not sure if that's was the question, but we used AI to:

1. Trace the possible call stack of a function. It did a tremendous
job there. Faster and more precise than myself, seeing through
"vtables" and describing the callstack correctly.
2. Find similar cases of RCU used in the kernel, and try to imitate
the behavior (for the not-yet-sent RCU WIP patch)
3. General kernel chores (find maintainers list, running sparse, etc.)
4. Create a reproducer for the RCU race to test it on a VM.

Let me know if you want more details!

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-06-24 21:04 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-11 20:46 [PATCH 0/2] powercap: dtpm: Guard against missing energy model in dtpm callbacks Elazar Leibovich
2026-06-11 20:46 ` [PATCH 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks Elazar Leibovich
2026-06-23 10:12   ` Daniel Lezcano
2026-06-11 20:46 ` [PATCH 2/2] powercap: dtpm_devfreq: Guard em_pd_get() against NULL return in callbacks Elazar Leibovich
2026-06-24 20:31 ` [PATCH v2 1/2] powercap: dtpm_cpu: Guard em_cpu_get() against NULL return in sysfs callbacks sivany32
2026-06-24 20:44   ` Daniel Lezcano
2026-06-24 21:03     ` Elazar Leibovich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox