* [PATCH v2 1/3] workqueue: only show running workers in stall diagnostics
From: Breno Leitao @ 2026-06-30 16:15 UTC (permalink / raw)
To: Tejun Heo, Lai Jiangshan, Song Liu
Cc: linux-kernel, pmladek, marco.crivellari, david.dai, Breno Leitao,
kernel-team

show_cpu_pool_busy_workers() dumps every in-flight worker in the pool's
busy_hash, including workers that are not currently running on the CPU.
Restore the task_is_running() filter so only running workers are dumped.

When no running worker is found the pool may be stuck, unable to wake an
idle worker to process pending work, and the watchdog would otherwise
give no feedback. Add show_pool_no_running_worker() to report the pool
id, CPU, idle state, and worker counts in that case.

The pool info message is printed under pool->lock using
printk_deferred_enter/exit, the same pattern used by the existing
busy-worker loop, to avoid deadlocks with console drivers that queue
work while holding locks also taken in their write paths.

This has been running on the Meta fleet for a while and has caught some
real issues, for instance EFI stalls that stalled the workqueue [1].

Link: https://lore.kernel.org/all/20260616-efi_timeout-v3-0-76dd1d26657b@debian.org/ [1]
Suggested-by: Petr Mladek <pmladek@suse.com>
Fixes: 8823eaef45da7 ("workqueue: Show all busy workers in stall diagnostics")
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
kernel/workqueue.c | 38 ++++++++++++++++++++++++++++++++++----
1 file changed, 34 insertions(+), 4 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 78f25afb4a9d6..efbac160b7628 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7693,13 +7693,31 @@ module_param_named(panic_on_stall_time, wq_panic_on_stall_time, uint, 0644);
MODULE_PARM_DESC(panic_on_stall_time, "Panic if stall exceeds this many seconds (0=disabled)");
/*
- * Show workers that might prevent the processing of pending work items.
- * A busy worker that is not running on the CPU (e.g. sleeping in
- * wait_event_idle() with PF_WQ_WORKER cleared) can stall the pool just as
- * effectively as a CPU-bound one, so dump every in-flight worker.
+ * Report that a pool has no worker in running state, which is a sign that the
+ * pool may be stuck. Print pool info. Must be called with pool->lock held;
+ * printing is deferred here via printk_deferred_enter/exit.
+ */
+static void show_pool_no_running_worker(struct worker_pool *pool)
+{
+ lockdep_assert_held(&pool->lock);
+
+ printk_deferred_enter();
+ pr_info("pool %d: no worker in running state, cpu=%d is %s (nr_workers=%d nr_idle=%d)\n",
+ pool->id, pool->cpu,
+ idle_cpu(pool->cpu) ? "idle" : "busy",
+ pool->nr_workers, pool->nr_idle);
+ pr_info("The pool might have trouble waking an idle worker.\n");
+ printk_deferred_exit();
+}
+
+/*
+ * Show running workers that might prevent the processing of pending work items.
+ * If no running worker is found, the pool may be stuck waiting for an idle
+ * worker to be woken, so report the pool state.
*/
static void show_cpu_pool_busy_workers(struct worker_pool *pool)
{
+ bool found_running = false;
struct worker *worker;
unsigned long irq_flags;
int bkt;
@@ -7707,6 +7725,11 @@ static void show_cpu_pool_busy_workers(struct worker_pool *pool)
raw_spin_lock_irqsave(&pool->lock, irq_flags);
hash_for_each(pool->busy_hash, bkt, worker, hentry) {
+ /* Skip workers that are not actively running on the CPU. */
+ if (!task_is_running(worker->task))
+ continue;
+
+ found_running = true;
/*
* Defer printing to avoid deadlocks in console
* drivers that queue work while holding locks
@@ -7720,6 +7743,13 @@ static void show_cpu_pool_busy_workers(struct worker_pool *pool)
printk_deferred_exit();
}
+ /*
+ * If no running worker was found, the pool is likely stuck. Print pool
+ * state.
+ */
+ if (!found_running)
+ show_pool_no_running_worker(pool);
+
raw_spin_unlock_irqrestore(&pool->lock, irq_flags);
}
--
2.53.0-Meta
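
For readers who want to experiment with the control flow of patch 1 outside
the kernel, here is a minimal userspace sketch. It is an illustration under
stated assumptions, not the kernel code: every model_* name is hypothetical,
a plain array stands in for busy_hash, a bool stands in for pool->lock, and
the "running" flag stands in for task_is_running(worker->task).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_worker {
	const char *name;
	bool running;	/* plays the role of task_is_running(worker->task) */
};

struct model_pool {
	int id;
	int cpu;
	int nr_workers;
	int nr_idle;
	bool lock_held;				/* plays the role of pool->lock */
	const struct model_worker *busy;	/* plays the role of busy_hash */
	size_t nr_busy;
};

/* Report the pool state; the caller must hold the (modelled) lock. */
static void model_show_pool_no_running_worker(const struct model_pool *pool)
{
	assert(pool->lock_held);	/* analogue of lockdep_assert_held() */
	printf("pool %d: no worker in running state, cpu=%d (nr_workers=%d nr_idle=%d)\n",
	       pool->id, pool->cpu, pool->nr_workers, pool->nr_idle);
}

/* Dump running busy workers only; returns how many were dumped. */
static size_t model_show_busy_workers(struct model_pool *pool)
{
	size_t i, shown = 0;

	pool->lock_held = true;
	for (i = 0; i < pool->nr_busy; i++) {
		/* Skip workers that are not running, as the patch does. */
		if (!pool->busy[i].running)
			continue;
		printf("pool %d: running worker %s\n",
		       pool->id, pool->busy[i].name);
		shown++;
	}
	/* Nothing running: fall back to reporting the pool itself. */
	if (shown == 0)
		model_show_pool_no_running_worker(pool);
	pool->lock_held = false;

	return shown;
}
```

Calling model_show_busy_workers() on a pool whose busy workers are all
sleeping takes the fallback path and prints the pool line, which mirrors the
case the patch adds feedback for.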
* [PATCH v2 2/3] workqueue: trigger a single-CPU backtrace for stalled pools
From: Breno Leitao @ 2026-06-30 16:15 UTC (permalink / raw)
To: Tejun Heo, Lai Jiangshan, Song Liu
Cc: linux-kernel, pmladek, marco.crivellari, david.dai, Breno Leitao,
kernel-team

When a CPU pool is stalled with no running worker, the task occupying the
CPU may not be a workqueue worker at all. Trigger a single-CPU backtrace
for the stalled CPU to capture what it is currently executing.

The CPU is snapshotted under pool->lock and the backtrace is triggered
after releasing the lock to avoid any potential issues with NMI delivery.

Skip the backtrace when the CPU is offline. A pool disassociated by CPU
hotplug keeps its pool->cpu, and an NMI to an offline CPU is never acked,
so nmi_trigger_cpumask_backtrace() would busy-wait for its full timeout
in the watchdog's timer context.

Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
kernel/workqueue.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index efbac160b7628..7d30e23c84087 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7720,10 +7720,13 @@ static void show_cpu_pool_busy_workers(struct worker_pool *pool)
bool found_running = false;
struct worker *worker;
unsigned long irq_flags;
- int bkt;
+ int cpu, bkt;
raw_spin_lock_irqsave(&pool->lock, irq_flags);
+ /* Snapshot cpu inside the lock to safely use it after unlock. */
+ cpu = pool->cpu;
+
hash_for_each(pool->busy_hash, bkt, worker, hentry) {
/* Skip workers that are not actively running on the CPU. */
if (!task_is_running(worker->task))
@@ -7751,6 +7754,15 @@ static void show_cpu_pool_busy_workers(struct worker_pool *pool)
show_pool_no_running_worker(pool);
raw_spin_unlock_irqrestore(&pool->lock, irq_flags);
+
+ /*
+ * Trigger a backtrace on the stalled CPU to capture what it is
+ * currently executing. Skip an offline CPU, whose NMI is never acked
+ * and would make the backtrace busy-wait until it times out. Done
+ * after releasing the lock to avoid issues with NMI delivery.
+ */
+ if (!found_running && cpu_online(cpu))
+ trigger_single_cpu_backtrace(cpu);
}
static void show_cpu_pools_busy_workers(void)
--
2.53.0-Meta
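
The "snapshot under the lock, act after unlock, skip offline CPUs" ordering
from patch 2 can be sketched in userspace as follows. This is only a model
under assumptions: the model_* names, the fixed four-entry online table and
the backtraced_cpu field are all hypothetical, and the stand-in for
trigger_single_cpu_backtrace() merely records which CPU was requested.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_pool {
	int cpu;
	bool lock_held;		/* plays the role of pool->lock */
	bool has_running_worker;
	int backtraced_cpu;	/* last CPU a backtrace was requested for, or -1 */
};

/* Assumed online map for the model: CPU 2 is offline. */
static bool model_cpu_online[MODEL_NR_CPUS] = { true, true, false, true };

/* Stand-in for trigger_single_cpu_backtrace(); must run without the lock. */
static void model_trigger_backtrace(struct model_pool *pool, int cpu)
{
	assert(!pool->lock_held);
	pool->backtraced_cpu = cpu;
}

/* Returns true when a backtrace was requested for the pool's CPU. */
static bool model_check_stalled_pool(struct model_pool *pool)
{
	bool found_running;
	int cpu;

	pool->lock_held = true;
	cpu = pool->cpu;	/* snapshot while the lock is held */
	found_running = pool->has_running_worker;
	pool->lock_held = false;

	/* From here on only the snapshot is used, never pool->cpu. */
	if (found_running)
		return false;
	if (cpu < 0 || cpu >= MODEL_NR_CPUS || !model_cpu_online[cpu])
		return false;	/* offline CPU: skip the backtrace */

	model_trigger_backtrace(pool, cpu);
	return true;
}
```

In this model a stalled pool on CPU 1 requests a backtrace, while the same
pool moved to the offline CPU 2 does not, matching the cpu_online() guard in
the patch.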
* [PATCH v2 3/3] workqueue: dump the last woken worker for stalled pools
From: Breno Leitao @ 2026-06-30 16:15 UTC (permalink / raw)
To: Tejun Heo, Lai Jiangshan, Song Liu
Cc: linux-kernel, pmladek, marco.crivellari, david.dai, Breno Leitao,
kernel-team

To identify the task most likely responsible for a stall, add
last_woken_worker (L: pool->lock) to worker_pool and record it in
kick_pool() just before wake_up_process(). This captures the idle
worker that was kicked to take over when the last running worker went to
sleep; if the pool is now stuck with no running worker, that task is the
prime suspect and its backtrace is dumped by show_pool_no_running_worker().

Using struct worker * rather than struct task_struct * avoids any
lifetime concern: workers are only destroyed via set_worker_dying()
which requires pool->lock, and set_worker_dying() clears
last_woken_worker when the dying worker matches.

show_cpu_pool_busy_workers() holds pool->lock while calling
sched_show_task(), so last_woken_worker is either NULL or points to a
live worker with a valid task. More precisely, set_worker_dying() clears
last_woken_worker before setting WORKER_DIE, so a non-NULL
last_woken_worker means the kthread has not yet exited and worker->task
is still alive.

Suggested-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
kernel/workqueue.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7d30e23c84087..3580c19150721 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -226,6 +226,7 @@ struct worker_pool {
/* L: hash of busy workers */
struct worker *manager; /* L: purely informational */
+ struct worker *last_woken_worker; /* L: last worker woken by kick_pool() */
struct list_head workers; /* A: attached workers */
struct ida worker_ida; /* worker IDs for task name */
@@ -1310,6 +1311,9 @@ static bool kick_pool(struct worker_pool *pool)
}
}
#endif
+ /* Track the last idle worker woken, used for stall diagnostics. */
+ pool->last_woken_worker = worker;
+
wake_up_process(p);
return true;
}
@@ -2948,6 +2952,13 @@ static void set_worker_dying(struct worker *worker, struct list_head *list)
pool->nr_workers--;
pool->nr_idle--;
+ /*
+ * Clear last_woken_worker if it points to this worker, so that
+ * show_cpu_pool_busy_workers() cannot dereference a freed worker.
+ */
+ if (pool->last_woken_worker == worker)
+ pool->last_woken_worker = NULL;
+
worker->flags |= WORKER_DIE;
list_move(&worker->entry, list);
@@ -7707,13 +7718,25 @@ static void show_pool_no_running_worker(struct worker_pool *pool)
idle_cpu(pool->cpu) ? "idle" : "busy",
pool->nr_workers, pool->nr_idle);
pr_info("The pool might have trouble waking an idle worker.\n");
+ /*
+ * last_woken_worker and its task are valid here: set_worker_dying()
+ * clears it under pool->lock before setting WORKER_DIE, so if
+ * last_woken_worker is non-NULL the kthread has not yet exited and
+ * worker->task is still alive.
+ */
+ if (pool->last_woken_worker) {
+ pr_info("Backtrace of last woken worker:\n");
+ sched_show_task(pool->last_woken_worker->task);
+ } else {
+ pr_info("Last woken worker empty\n");
+ }
printk_deferred_exit();
}
/*
* Show running workers that might prevent the processing of pending work items.
* If no running worker is found, the pool may be stuck waiting for an idle
- * worker to be woken, so report the pool state.
+ * worker to be woken, so report the pool state and the last woken worker.
*/
static void show_cpu_pool_busy_workers(struct worker_pool *pool)
{
@@ -7748,7 +7771,8 @@ static void show_cpu_pool_busy_workers(struct worker_pool *pool)
/*
* If no running worker was found, the pool is likely stuck. Print pool
- * state.
+ * state and the backtrace of the last woken worker, which is the prime
+ * suspect for the stall.
*/
if (!found_running)
show_pool_no_running_worker(pool);
--
2.53.0-Meta
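
The lifetime argument in patch 3 (record the pointer under the lock, clear it
under the same lock before the worker is marked dying) can be modelled in
userspace as below. This is a sketch under assumptions, not the kernel
implementation: all model_* names are hypothetical, a bool stands in for
pool->lock, and the "dying" flag stands in for WORKER_DIE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_worker {
	int id;
	bool dying;	/* plays the role of WORKER_DIE */
};

struct model_pool {
	bool lock_held;				/* plays the role of pool->lock */
	struct model_worker *last_woken_worker;	/* protected by the lock */
};

/* Record the worker about to be woken, as kick_pool() does in the patch. */
static void model_kick_pool(struct model_pool *pool, struct model_worker *worker)
{
	assert(pool->lock_held);
	pool->last_woken_worker = worker;
}

/* Clear the tracked pointer first, then mark the worker as dying. */
static void model_set_worker_dying(struct model_pool *pool,
				   struct model_worker *worker)
{
	assert(pool->lock_held);
	if (pool->last_woken_worker == worker)
		pool->last_woken_worker = NULL;
	worker->dying = true;
}

/* A reader holding the lock sees either NULL or a worker not yet dying. */
static const struct model_worker *model_last_woken(const struct model_pool *pool)
{
	assert(pool->lock_held);
	assert(!pool->last_woken_worker || !pool->last_woken_worker->dying);
	return pool->last_woken_worker;
}
```

Because both the writer paths and the reader assert that the lock is held, a
non-NULL result from model_last_woken() always refers to a worker that has not
been marked dying, which is the invariant the commit message relies on.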
* Re: [PATCH v2 0/3] workqueue: improve stall diagnostics for pools with no running worker
From: Tejun Heo @ 2026-06-30 16:54 UTC (permalink / raw)
To: Breno Leitao
Cc: Lai Jiangshan, Song Liu, linux-kernel, Petr Mladek,
marco.crivellari, david.dai, kernel-team

Hello, Breno.

Applied 1-2 to wq/for-7.3.

3 doesn't apply on for-7.3. Can you rebase and resend?

Thanks.

--
tejun
* Re: [PATCH v2 0/3] workqueue: improve stall diagnostics for pools with no running worker
From: Breno Leitao @ 2026-06-30 16:59 UTC (permalink / raw)
To: Tejun Heo
Cc: Lai Jiangshan, Song Liu, linux-kernel, Petr Mladek,
marco.crivellari, david.dai, kernel-team

Hello Tejun,

On Tue, Jun 30, 2026 at 06:54:21AM -1000, Tejun Heo wrote:
> Hello, Breno.
>
> Applied 1-2 to wq/for-7.3.

Thanks

> 3 doesn't apply on for-7.3. Can you rebase and resend?

Damn, I am sorry, my fault.

I will rebase and resend.

Thanks!