public inbox for linux-kernel@vger.kernel.org
* [PATCH] cpu/hotplug: disallow writing any state in atomic AP section to sysfs target
@ 2024-12-20 14:15 Koichiro Den
  2025-01-16 13:21 ` Thomas Gleixner
  2025-01-24 15:33 ` Vishal Chourasia
  0 siblings, 2 replies; 4+ messages in thread
From: Koichiro Den @ 2024-12-20 14:15 UTC
  To: linux-kernel; +Cc: tglx, peterz

When CONFIG_CPU_HOTPLUG_STATE_CONTROL=y, writing a state within the
atomic AP section to the 'hotplug/target' file of a fully online CPU can
crash the kernel [1]. This occurs because take_cpu_down() disables
the CPU, but the state machine does not reach CPUHP_AP_OFFLINE. As a
result, when the cpu stopper thread finishes its work and the idle task
takes over, cpuhp_report_idle_dead() hits 'BUG_ON(st->state !=
CPUHP_AP_OFFLINE)'.

In the opposite direction, start_secondary() assumes all startup
callbacks have been invoked and transitions to CPUHP_AP_ONLINE_IDLE,
regardless of the written target. This can result in some callbacks in
the section being silently skipped.

To address the issue, reject writes of any state within the atomic AP
section to the sysfs target. Additionally, set cant_stop to true for both
CPUHP_BP_KICK_AP (when CONFIG_HOTPLUG_SPLIT_STARTUP=y) and
CPUHP_AP_ONLINE, since the state machine does not automatically
proceed to the other end of the atomic states.
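
For illustration only, the target check can be modelled in user space
roughly as below. This is a sketch, not the kernel code: the toy_* names,
the reduced set of states and their numeric values are all made up, and
the assumption that the atomic AP section runs from the idle-dead state up
to (but not including) the AP online state is mine; the real
cpuhp_is_atomic_state() in kernel/cpu.c is authoritative.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Toy model of the hotplug state space. Names echo the kernel's, but the
 * values and the reduced list of states are hypothetical.
 */
enum toy_cpuhp_state {
	TOY_CPUHP_OFFLINE = 0,
	TOY_CPUHP_BP_KICK_AP,
	TOY_CPUHP_AP_IDLE_DEAD,		/* assumed start of the atomic AP section */
	TOY_CPUHP_AP_OFFLINE,
	TOY_CPUHP_AP_TICK_DYING,
	TOY_CPUHP_AP_ONLINE,		/* assumed first state past the section */
	TOY_CPUHP_AP_ONLINE_IDLE,
	TOY_CPUHP_ONLINE,
};

/* Assumed range check: [AP_IDLE_DEAD, AP_ONLINE) counts as atomic. */
static inline bool toy_is_atomic_state(int state)
{
	return state >= TOY_CPUHP_AP_IDLE_DEAD && state < TOY_CPUHP_AP_ONLINE;
}

/*
 * Mirrors the shape of the check added to target_store(): reject targets
 * outside [OFFLINE, ONLINE] and, newly, targets inside the atomic section.
 */
static inline int toy_validate_target(int target)
{
	if (target < TOY_CPUHP_OFFLINE || target > TOY_CPUHP_ONLINE ||
	    toy_is_atomic_state(target))
		return -EINVAL;
	return 0;
}

/* Small usage example: the tick:dying-like state from [1] is refused. */
static inline void toy_self_check(void)
{
	assert(toy_validate_target(TOY_CPUHP_AP_TICK_DYING) == -EINVAL);
	assert(toy_validate_target(TOY_CPUHP_ONLINE) == 0);
}
```

In this model a write such as the one in [1] fails with -EINVAL before
the state machine is touched, while targets on either side of the atomic
section are still accepted.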

[1]:

  # grep 'tick:dying' /sys/devices/system/cpu/hotplug/states
    143: tick:dying
  # cat /sys/devices/system/cpu/cpu7/hotplug/target
    238  # fully online
  # echo 143 > /sys/devices/system/cpu/cpu7/hotplug/target

    [  145.091832] ------------[ cut here ]------------
    [  145.092928] kernel BUG at kernel/cpu.c:1365!
    [  145.093960] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
    --(snip)--

Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
---
Previous attempt:
https://lore.kernel.org/all/20241207144721.2828390-1-koichiro.den@canonical.com/
---
 kernel/cpu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 34f1a09349fc..c877443f5888 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2127,6 +2127,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
 	[CPUHP_BP_KICK_AP] = {
 		.name			= "cpu:kick_ap",
 		.startup.single		= cpuhp_kick_ap_alive,
+		.cant_stop		= true,
 	},
 
 	/*
@@ -2192,6 +2193,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
 	 * state for synchronsization */
 	[CPUHP_AP_ONLINE] = {
 		.name			= "ap:online",
+		.cant_stop		= true,
 	},
 	/*
 	 * Handled on control processor until the plugged processor manages
@@ -2759,7 +2761,8 @@ static ssize_t target_store(struct device *dev, struct device_attribute *attr,
 		return ret;
 
 #ifdef CONFIG_CPU_HOTPLUG_STATE_CONTROL
-	if (target < CPUHP_OFFLINE || target > CPUHP_ONLINE)
+	if (target < CPUHP_OFFLINE || target > CPUHP_ONLINE ||
+	    cpuhp_is_atomic_state(target))
 		return -EINVAL;
 #else
 	if (target != CPUHP_OFFLINE && target != CPUHP_ONLINE)
-- 
2.43.0



end of thread, other threads:[~2025-01-24 15:55 UTC | newest]

Thread overview: 4+ messages
2024-12-20 14:15 [PATCH] cpu/hotplug: disallow writing any state in atomic AP section to sysfs target Koichiro Den
2025-01-16 13:21 ` Thomas Gleixner
2025-01-18  7:40   ` Koichiro Den
2025-01-24 15:33 ` Vishal Chourasia
