linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH] cpu: fix hard lockup triggered during stress-ng stress testing.
@ 2025-09-18  6:49 shechenglong
  2025-09-18 11:28 ` Catalin Marinas
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: shechenglong @ 2025-09-18  6:49 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will, linux-arm-kernel, linux-kernel, stone.xulei, chenjialong,
	yuxiating, shechenglong

Context of the Issue:
In an ARM64 environment, the following steps were performed:

1. Repeatedly ran stress-ng to stress the CPU, memory, and I/O.
2. Cyclically executed test case pty06 from the LTP test suite.
3. Added mitigations=off to the GRUB parameters.

After 1-2 hours of stress testing, a hard lockup occurred and the
system crashed.

Root Cause of the Hard Lockup:
Each time stress-ng starts, it writes to the
/sys/kernel/debug/clear_warn_once interface, which clears the memory
section from __start_once to __end_once. As a result, functions such as
pr_info_once(), which are meant to print only once, print again every
time stress-ng is started. If the pty06 test case happens to be using the
serial module at that moment, it sleeps on waiter.list in __down_common().

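After pr_info_once() finishes its output through the serial module, it
calls the semaphore up() function to wake the process waiting on
waiter.list. The print is issued from __switch_to(), where __schedule()
already holds rq->__lock, and the wake-up path tries to take rq->__lock
again in ttwu_queue(). This is an A-A deadlock, which ends in a hard
lockup and a system crash.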

To prevent this, use a function-local static flag to ensure the message
is printed only once. Such a flag is not placed in the __start_once to
__end_once section, so clear_warn_once does not reset it.
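
The difference between the two kinds of "once" flag can be modelled in
user space. The sketch below is not kernel code: once_section,
clear_warn_once_model() and both print_once_*() helpers are hypothetical
names, and the array only stands in for the __start_once to __end_once
region. It assumes, as the patch does, that a flag kept outside that
region is untouched by the clear.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the __start_once..__end_once region. */
static bool once_section[1];

/* Models a write to clear_warn_once: zero every flag in the region. */
static void clear_warn_once_model(void)
{
	memset(once_section, 0, sizeof(once_section));
}

/* Models pr_info_once(): returns true when the message would be printed. */
static bool print_once_in_section(void)
{
	if (once_section[0])
		return false;
	once_section[0] = true;
	return true;
}

/* Models the patched code: the flag lives outside the clearable region. */
static bool print_once_private_flag(void)
{
	static atomic_int printed;
	int expected = 0;

	/* Succeeds only for the first caller, like atomic_cmpxchg(&v, 0, 1) == 0. */
	return atomic_compare_exchange_strong(&printed, &expected, 1);
}
```

In this model, calling clear_warn_once_model() makes
print_once_in_section() report "print" again, while
print_once_private_flag() keeps returning false after its first call.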

Hard lockup call stack:

_raw_spin_lock_nested+168
ttwu_queue+180 (rq_lock(rq, &rf); second acquisition of rq->__lock)
try_to_wake_up+548
wake_up_process+32
__up+88
up+100
__up_console_sem+96
console_unlock+696
vprintk_emit+428
vprintk_default+64
vprintk_func+220
printk+104
spectre_v4_enable_task_mitigation+344
__switch_to+100
__schedule+1028 (rq_lock(rq, &rf); first acquisition of rq->__lock)
schedule_idle+48
do_idle+388
cpu_startup_entry+44
secondary_start_kernel+352
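
The shape of this stack, a wake-up that needs a lock its own caller
already holds, can be sketched in user space. Everything here is a
hypothetical model rather than scheduler code: rq_lock_model stands in
for rq->__lock, and a try-lock is used so that the sketch reports the
condition instead of spinning forever as a real spinlock would.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for rq->__lock: a non-recursive spinlock. */
static atomic_flag rq_lock_model = ATOMIC_FLAG_INIT;

/* One lock attempt; a real spin_lock() would retry until this succeeds. */
static bool rq_trylock_model(void)
{
	return !atomic_flag_test_and_set(&rq_lock_model);
}

static void rq_unlock_model(void)
{
	atomic_flag_clear(&rq_lock_model);
}

/*
 * Models printk() -> up() -> try_to_wake_up() -> ttwu_queue(), which
 * needs the runqueue lock. Returns false where the real code would hang.
 */
static bool wake_waiter_model(void)
{
	if (!rq_trylock_model())
		return false;	/* already held: second acquisition never succeeds */
	rq_unlock_model();
	return true;
}

/* Models __schedule() -> __switch_to() -> printk() with the lock held. */
static bool schedule_with_print_model(void)
{
	bool ok;

	if (!rq_trylock_model())
		return false;
	ok = wake_waiter_model();	/* first acquisition is still held here */
	rq_unlock_model();
	return ok;
}
```

Called on its own, wake_waiter_model() succeeds. Called through
schedule_with_print_model(), it fails, which corresponds to the second
rq_lock() in the stack above never returning.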

Signed-off-by: shechenglong <shechenglong@xfusion.com>
---
 arch/arm64/kernel/proton-pack.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
index edf1783ffc81..f8663157e041 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -424,8 +424,10 @@ static bool spectre_v4_mitigations_off(void)
 	bool ret = cpu_mitigations_off() ||
 		   __spectre_v4_policy == SPECTRE_V4_POLICY_MITIGATION_DISABLED;
 
-	if (ret)
-		pr_info_once("spectre-v4 mitigation disabled by command-line option\n");
+	static atomic_t __printk_once = ATOMIC_INIT(0);
+
+	if (ret && !atomic_cmpxchg(&__printk_once, 0, 1))
+		pr_info("spectre-v4 mitigation disabled by command-line option\n");
 
 	return ret;
 }
-- 
2.33.0




Thread overview: 20+ messages
2025-09-18  6:49 [PATCH] cpu: fix hard lockup triggered during stress-ng stress testing shechenglong
2025-09-18 11:28 ` Catalin Marinas
2025-09-19 12:05   ` Re: " shechenglong
2025-09-22 16:54     ` Catalin Marinas
2025-09-22 16:08   ` Mark Rutland
2025-09-24 12:32 ` [PATCH] cpu: fix hard lockup triggered by printk calls within scheduling context shechenglong
2025-09-25 13:48   ` Catalin Marinas
2025-10-03 14:23   ` Will Deacon
2025-10-20 14:51 ` [PATCH v2 0/2] arm64: spectre: Fix hard lockup and cleanup mitigation messages shechenglong
2025-10-20 14:51   ` [PATCH v2 1/2] cpu:Remove the print when the CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY Kconfig option is disabled shechenglong
2025-10-20 14:51   ` [PATCH v2 2/2] cpu: fix hard lockup triggered by printk calls within scheduling context shechenglong
2025-10-29  3:45 ` [RESEND v2 0/2] arm64: spectre: Fix hard lockup and cleanup mitigation messages shechenglong
2025-10-29  3:45   ` [PATCH v2 1/2] cpu:Remove the print when the CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY Kconfig option is disabled shechenglong
2025-10-30 14:48     ` Will Deacon
2025-10-29  3:45   ` [PATCH v2 2/2] cpu: fix hard lockup triggered by printk calls within scheduling context shechenglong
2025-10-30 14:50     ` Will Deacon
2025-10-31  9:15 ` [PATCH v3 0/2] arm64: spectre: Fix hard lockup and cleanup mitigation messages shechenglong
2025-10-31  9:15   ` [PATCH v3 1/2] cpu:Remove the print when the CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY Kconfig option is disabled shechenglong
2025-10-31  9:15   ` [PATCH v3 2/2] cpu: fix hard lockup triggered by printk calls within scheduling context shechenglong
2025-11-07 15:53   ` [PATCH v3 0/2] arm64: spectre: Fix hard lockup and cleanup mitigation messages Will Deacon
