* [PATCH] cpufreq: Fix race between suspend/resume and CPU hotplug
From: Tianxiang Chen @ 2026-04-07 9:35 UTC (permalink / raw)
To: rafael; +Cc: viresh.kumar, linux-pm, linux-kernel, lingyue, Tianxiang Chen
CPU hotplug operations can race with cpufreq_suspend()
and cpufreq_resume(), leading to null pointer dereferences
when accessing governor data. This occurs because there is
no synchronization between suspend/resume operations and
CPU hotplug, allowing concurrent access to
policy->governor_data while it is being freed or initialized.
Detailed race condition scenario:
1. Thread A (cpufreq_suspend) starts execution:
- Iterates through active policies
- Calls cpufreq_stop_governor(policy) for each policy
- Sets cpufreq_suspended = true
2. Thread B (CPU hotplug) executes concurrently:
- Calls cpu_down(cpu)
- Calls cpuhp_cpufreq_offline(cpu)
- Calls cpufreq_offline(cpu)
- Inside cpufreq_offline():
* Stops governor: policy->governor->stop(policy)
* Exits governor: policy->governor->exit(policy)
* Frees governor_data: kfree(policy->governor_data)
* Sets policy->governor_data = NULL
3. Race window between step 1 and step 2:
- Thread A is iterating policies and stopping governors
- Thread B is concurrently executing CPU offline
- Both threads may access the same policy->governor_data
- Thread B frees governor_data while Thread A is still using it
- Thread A accesses freed governor_data → null pointer dereference
Similarly, cpufreq_resume() can race with CPU hotplug where governor_data
is being initialized while hotplug is trying to access it, leading to
accessing uninitialized data.
Signed-off-by: Tianxiang Chen <nanmu@xiaomi.com>
---
drivers/cpufreq/cpufreq.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 1f794524a1d9..8b03785764fa 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1979,6 +1979,7 @@ void cpufreq_suspend(void)
 	if (!cpufreq_driver)
 		return;
 
+	cpus_read_lock();
 	if (!has_target() && !cpufreq_driver->suspend)
 		goto suspend;
 
@@ -1998,6 +1999,7 @@ void cpufreq_suspend(void)
 
 suspend:
 	cpufreq_suspended = true;
+	cpus_read_unlock();
 }
 
 /**
@@ -2017,10 +2019,11 @@ void cpufreq_resume(void)
 	if (unlikely(!cpufreq_suspended))
 		return;
 
+	cpus_read_lock();
 	cpufreq_suspended = false;
 
 	if (!has_target() && !cpufreq_driver->resume)
-		return;
+		goto out;
 
 	pr_debug("%s: Resuming Governors\n", __func__);
 
@@ -2038,6 +2041,9 @@ void cpufreq_resume(void)
 				       __func__, policy->cpu);
 		}
 	}
+
+out:
+	cpus_read_unlock();
 }
 
 /**
--
2.34.1
#/****** This e-mail and its attachments contain confidential information from XIAOMI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! ******/#
* Re: [PATCH] cpufreq: Fix race between suspend/resume and CPU hotplug
From: Rafael J. Wysocki @ 2026-04-07 11:50 UTC (permalink / raw)
To: Tianxiang Chen; +Cc: rafael, viresh.kumar, linux-pm, linux-kernel, lingyue
On Tue, Apr 7, 2026 at 11:35 AM Tianxiang Chen <nanmu@xiaomi.com> wrote:
>
> CPU hotplug operations can race with cpufreq_suspend()
> and cpufreq_resume(), leading to null pointer dereferences
> when accessing governor data.
So how exactly would CPU hotplug be started during a system suspend or resume?
> This occurs because there is
> no synchronization between suspend/resume operations and
> CPU hotplug, allowing concurrent access to
> policy->governor_data while it is being freed or initialized.
>
> Detailed race condition scenario:
>
> 1. Thread A (cpufreq_suspend) starts execution:
> - Iterates through active policies
> - Calls cpufreq_stop_governor(policy) for each policy
> - Sets cpufreq_suspended = true
>
> 2. Thread B (CPU hotplug) executes concurrently:
> - Calls cpu_down(cpu)
> - Calls cpuhp_cpufreq_offline(cpu)
> - Calls cpufreq_offline(cpu)
> - Inside cpufreq_offline():
> * Stops governor: policy->governor->stop(policy)
> * Exits governor: policy->governor->exit(policy)
> * Frees governor_data: kfree(policy->governor_data)
> * Sets policy->governor_data = NULL
>
> 3. Race window between step 1 and step 2:
> - Thread A is iterating policies and stopping governors
> - Thread B is concurrently executing CPU offline
> - Both threads may access the same policy->governor_data
> - Thread B frees governor_data while Thread A is still using it
> - Thread A accesses freed governor_data → null pointer dereference
>
> Similarly, cpufreq_resume() can race with CPU hotplug where governor_data
> is being initialized while hotplug is trying to access it, leading to
> accessing uninitialized data.
>
> Signed-off-by: Tianxiang Chen <nanmu@xiaomi.com>
> ---
> drivers/cpufreq/cpufreq.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 1f794524a1d9..8b03785764fa 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1979,6 +1979,7 @@ void cpufreq_suspend(void)
> if (!cpufreq_driver)
> return;
>
> + cpus_read_lock();
> if (!has_target() && !cpufreq_driver->suspend)
> goto suspend;
>
> @@ -1998,6 +1999,7 @@ void cpufreq_suspend(void)
>
> suspend:
> cpufreq_suspended = true;
> + cpus_read_unlock();
> }
>
> /**
> @@ -2017,10 +2019,11 @@ void cpufreq_resume(void)
> if (unlikely(!cpufreq_suspended))
> return;
>
> + cpus_read_lock();
> cpufreq_suspended = false;
>
> if (!has_target() && !cpufreq_driver->resume)
> - return;
> + goto out;
>
> pr_debug("%s: Resuming Governors\n", __func__);
>
> @@ -2038,6 +2041,9 @@ void cpufreq_resume(void)
> __func__, policy->cpu);
> }
> }
> +
> +out:
> + cpus_read_unlock();
> }
>
> /**
> --
> 2.34.1
>
>
* Re: [PATCH] cpufreq: fix race between hotplug and suspend
From: Tianxiang Chen @ 2026-04-08 1:46 UTC (permalink / raw)
To: rafael; +Cc: viresh.kumar, lingyue, linux-pm, linux-kernel
On Tue, 7 Apr 2026, Rafael J. Wysocki wrote:
> So how exactly would CPU hotplug be started during a system suspend or resume?
Hi Rafael,
Thank you for your question. Let me explain the two scenarios:
1. cpufreq_suspend() During Reboot (Confirmed Issue)
The real and reproducible race I encountered occurs during system reboot.
Call chain:
kernel_restart() -> kernel_restart_prepare()
-> device_shutdown() -> cpufreq_suspend()
Different from the regular suspend path, the reboot path does NOT call
freeze_processes() at all.
All userspace processes, drivers and kernel threads are
still running when cpufreq_suspend() executes. This allows CPU hotplug
(offline/online) operations to run concurrently with cpufreq_suspend().
2. System suspend/resume (Less Likely but Possible)
CPU hotplug is less likely during system suspend/resume. However,
non-freezable kernel threads may keep running throughout the entire
process, which may still trigger CPU hotplug in theory.
So I added cpus_read_lock()/cpus_read_unlock() to block CPU hotplug
while suspend or resume is in progress.
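As a side note on the cpufreq_resume() change, once a lock is taken every later exit has to release it, which is why the early `return` in the patch becomes `goto out`. A minimal userspace sketch of that single-exit pattern follows. The names `lock_depth`, `model_lock`, `model_unlock` and `resume_model` are hypothetical and only count lock and unlock calls; they are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter standing in for the hotplug lock state. */
static int lock_depth;
/* Hypothetical stand-in for the cpufreq_suspended flag. */
static bool suspended = true;

static void model_lock(void)
{
	lock_depth++;
}

static void model_unlock(void)
{
	lock_depth--;
}

/* Returns true if the governors would have been restarted. */
static bool resume_model(bool has_work)
{
	bool restarted = false;

	if (!suspended)
		return false;	/* lock not taken yet, a plain return is fine */

	model_lock();
	suspended = false;

	if (!has_work)
		goto out;	/* a bare return here would leave the lock held */

	restarted = true;	/* governors would be restarted here */
out:
	model_unlock();
	return restarted;
}
```

Whichever branch is taken after `model_lock()`, control passes through `out:`, so `lock_depth` is back to zero when the function returns.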
--
Thx and BRs,
Tianxiang Chen
* Re: [PATCH] cpufreq: fix race between hotplug and suspend
From: Rafael J. Wysocki @ 2026-04-08 10:27 UTC (permalink / raw)
To: Tianxiang Chen; +Cc: rafael, viresh.kumar, lingyue, linux-pm, linux-kernel
On Wed, Apr 8, 2026 at 3:46 AM Tianxiang Chen <nanmu@xiaomi.com> wrote:
>
> On Tue, 7 Apr 2026, Rafael J. Wysocki wrote:
> > So how exactly would CPU hotplug be started during a system suspend or resume?
>
> Hi Rafael,
>
> Thank you for your question. Let me explain the two scenarios:
>
> 1. cpufreq_suspend() During Reboot (Confirmed Issue)
Which needs to be mentioned in the patch changelog.
> The real and reproducible race I encountered occurs during system reboot.
>
> Call chain:
> kernel_restart() -> kernel_restart_prepare()
> -> device_shutdown() -> cpufreq_suspend()
>
> Different from the regular suspend path, the reboot path does NOT call
> freeze_processes() at all.
That's correct.
> All userspace processes, drivers and kernel threads are
> still running when cpufreq_suspend() executes. This allows CPU hotplug
> (offline/online) operations to run concurrently with cpufreq_suspend().
>
> 2. System suspend/resume (Less Likely but Possible)
>
> CPU hotplug is less likely during system suspend/resume. However,
> non-freezable kernel threads may keep running throughout the entire
> process, which may still trigger CPU hotplug in theory.
Which would be a bug in the kernel thread in question. So not really.
> So I added cpus_read_lock()/cpus_read_unlock() to block CPU hotplug
> while resume is in progress.
Please resend the patch with a changelog actually mentioning the
failure that you have observed.
Thanks!
* [PATCH v2] cpufreq: Fix hotplug-suspend race during reboot
From: Tianxiang Chen @ 2026-04-08 14:19 UTC (permalink / raw)
To: rafael; +Cc: viresh.kumar, lingyue, linux-pm, linux-kernel, Tianxiang Chen
During system reboot, cpufreq_suspend() is called via the
kernel_restart() -> kernel_restart_prepare() -> device_shutdown()
path. Unlike the normal system suspend path, the reboot path does not
call freeze_processes(), so userspace processes and kernel threads
remain active.
This allows CPU hotplug operations to run concurrently with
cpufreq_suspend(). The original code has no synchronization with CPU
hotplug, leading to a race condition where governor_data can be freed
by the hotplug path while cpufreq_suspend() is still accessing it,
resulting in a null pointer dereference:
Unable to handle kernel NULL pointer dereference
Call Trace:
do_kernel_fault+0x28/0x3c
cpufreq_suspend+0xdc/0x160
device_shutdown+0x18/0x200
kernel_restart+0x40/0x80
arm64_sys_reboot+0x1b0/0x200
Fix this by adding cpus_read_lock()/cpus_read_unlock() to
cpufreq_suspend() to block CPU hotplug operations while suspend is in
progress.
Signed-off-by: Tianxiang Chen <nanmu@xiaomi.com>
---
v2:
- Update changelog to explicitly mention reboot scenario
- Add observed crash trace
---
drivers/cpufreq/cpufreq.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 1f794524a1d9..6f1d264c378b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1979,6 +1979,7 @@ void cpufreq_suspend(void)
 	if (!cpufreq_driver)
 		return;
 
+	cpus_read_lock();
 	if (!has_target() && !cpufreq_driver->suspend)
 		goto suspend;
 
@@ -1998,6 +1999,7 @@ void cpufreq_suspend(void)
 
 suspend:
 	cpufreq_suspended = true;
+	cpus_read_unlock();
 }
 
 /**
--
2.34.1