* [PATCH v2] workqueue: add cmdline parameter workqueue.panic_on_stall
From: Sangmoon Kim @ 2024-08-06 5:12 UTC (permalink / raw)
To: Tejun Heo
Cc: youngjae24.lim, jordan.lim, myoungjae.kim, Sangmoon Kim,
Lai Jiangshan, linux-kernel
When we want to debug a workqueue stall, we can trigger a panic
immediately to get the information we want.

In some systems, it may be necessary to quickly reboot the system to
escape from a workqueue lockup situation. In this case, we can control
the number of stall detections required to generate a panic.

workqueue.panic_on_stall sets the number of stall detections that
triggers a panic. 0 disables the panic on stall.

Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com>
---
v2
- Combine 'panic_on_watchdog' and 'max_watchdog_to_panic' into
'panic_on_stall'
v1: https://lore.kernel.org/lkml/20240730080428.2556769-1-sangmoon.kim@samsung.com
---
kernel/workqueue.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index dfd42c28e404..801d984b68e5 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7406,6 +7406,9 @@ static struct timer_list wq_watchdog_timer;
static unsigned long wq_watchdog_touched = INITIAL_JIFFIES;
static DEFINE_PER_CPU(unsigned long, wq_watchdog_touched_cpu) = INITIAL_JIFFIES;
+static unsigned int wq_panic_on_stall;
+module_param_named(panic_on_stall, wq_panic_on_stall, uint, 0644);
+
/*
* Show workers that might prevent the processing of pending work items.
* The only candidates are CPU-bound workers in the running state.
@@ -7457,6 +7460,16 @@ static void show_cpu_pools_hogs(void)
 	rcu_read_unlock();
 }
+static void panic_on_wq_watchdog(void)
+{
+	static unsigned int wq_stall;
+
+	if (wq_panic_on_stall) {
+		wq_stall++;
+		BUG_ON(wq_stall >= wq_panic_on_stall);
+	}
+}
+
 static void wq_watchdog_reset_touched(void)
 {
 	int cpu;
@@ -7529,6 +7542,9 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
 	if (cpu_pool_stall)
 		show_cpu_pools_hogs();
+	if (lockup_detected)
+		panic_on_wq_watchdog();
+
 	wq_watchdog_reset_touched();
 	mod_timer(&wq_watchdog_timer, jiffies + thresh);
 }
--
2.34.1
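
The threshold semantics of the patch above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: struct stall_model
and stall_model_report() are hypothetical names introduced here, and a
boolean return value stands in for the point where the patch would hit
BUG_ON(). It assumes, as in the patch, that the counter is never reset, so
stalls are counted cumulatively rather than consecutively.

```c
#include <stdbool.h>

/*
 * Userspace model of the counter added by the patch.
 * All names here are illustrative and do not exist in the kernel.
 */
struct stall_model {
	unsigned int panic_on_stall;	/* models workqueue.panic_on_stall */
	unsigned int stalls;		/* models the static wq_stall counter */
};

/*
 * Call once per detected stall. Returns true at the point where the
 * patch's BUG_ON(wq_stall >= wq_panic_on_stall) would fire.
 */
static bool stall_model_report(struct stall_model *m)
{
	if (!m->panic_on_stall)
		return false;	/* 0 disables the feature; nothing is counted */

	m->stalls++;		/* cumulative, never reset */
	return m->stalls >= m->panic_on_stall;
}
```

With panic_on_stall set to 3, the first two reported stalls return false and
the third returns true. With panic_on_stall set to 0, no call ever returns
true and the counter stays at 0, which matches the "0 disables" wording in
the commit message.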
* Re: [PATCH v2] workqueue: add cmdline parameter workqueue.panic_on_stall
From: Tejun Heo @ 2024-08-06 20:07 UTC (permalink / raw)
To: Sangmoon Kim
Cc: youngjae24.lim, jordan.lim, myoungjae.kim, Lai Jiangshan,
linux-kernel
On Tue, Aug 06, 2024 at 02:12:09PM +0900, Sangmoon Kim wrote:
> When we want to debug a workqueue stall, we can trigger a panic
> immediately to get the information we want.
>
> In some systems, it may be necessary to quickly reboot the system to
> escape from a workqueue lockup situation. In this case, we can control
> the number of stall detections required to generate a panic.
>
> workqueue.panic_on_stall sets the number of stall detections that
> triggers a panic. 0 disables the panic on stall.
>
> Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com>
Applied to wq/for-6.12. Can you do a follow-up patch to document it in
kernel-parameters?
Thanks.
--
tejun
* RE: [PATCH v2] workqueue: add cmdline parameter workqueue.panic_on_stall
From: Sangmoon Kim @ 2024-08-07 1:47 UTC (permalink / raw)
To: Tejun Heo
Cc: jiangshanlai, jordan.lim, linux-kernel, myoungjae.kim,
sangmoon.kim, youngjae24.lim
> -----Original Message-----
> From: Tejun Heo <htejun@gmail.com> On Behalf Of Tejun Heo
> Sent: Wednesday, August 7, 2024 5:08 AM
>
> On Tue, Aug 06, 2024 at 02:12:09PM +0900, Sangmoon Kim wrote:
> > When we want to debug a workqueue stall, we can trigger a panic
> > immediately to get the information we want.
> >
> > In some systems, it may be necessary to quickly reboot the system to
> > escape from a workqueue lockup situation. In this case, we can control
> > the number of stall detections required to generate a panic.
> >
> > workqueue.panic_on_stall sets the number of stall detections that
> > triggers a panic. 0 disables the panic on stall.
> >
> > Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com>
>
> Applied to wq/for-6.12. Can you do a follow-up patch to document it in
> kernel-parameters?
>
> Thanks.
>
> --
> tejun
Okay. Let me prepare it.
Thanks
Sangmoon