From: Breno Leitao <leitao@debian.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org,
Omar Sandoval <osandov@osandov.com>, Song Liu <song@kernel.org>,
Danielle Costantino <dcostantino@meta.com>,
kasan-dev@googlegroups.com, kernel-team@meta.com
Subject: Re: [PATCH v2 4/5] workqueue: Show all busy workers in stall diagnostics
Date: Wed, 18 Mar 2026 04:31:08 -0700 [thread overview]
Message-ID: <abqLMy-ktHXRRQ1C@gmail.com> (raw)
In-Reply-To: <abQ6_FsxtfH8nXka@pathway>
Hello Petr,
On Fri, Mar 13, 2026 at 05:27:40PM +0100, Petr Mladek wrote:
> I took inspiration from your patch. This is what comes to my mind
> on top of the current master (printing only running workers):
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index aeaec79bc09c..a044c7e42139 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -7588,12 +7588,15 @@ static void show_cpu_pool_hog(struct worker_pool *pool)
> {
> struct worker *worker;
> unsigned long irq_flags;
> + bool found_running;
> int bkt;
>
> raw_spin_lock_irqsave(&pool->lock, irq_flags);
>
> + found_running = false;
> hash_for_each(pool->busy_hash, bkt, worker, hentry) {
> if (task_is_running(worker->task)) {
> + found_running = true;
> /*
> * Defer printing to avoid deadlocks in console
> * drivers that queue work while holding locks
> @@ -7609,6 +7612,19 @@ static void show_cpu_pool_hog(struct worker_pool *pool)
> }
>
> raw_spin_unlock_irqrestore(&pool->lock, irq_flags);
> +
> + if (!found_running) {
> + pr_info("pool %d: no worker in running state, cpu=%d is %s (nr_workers=%d nr_idle=%d)\n",
> + pool->id, pool->cpu,
> + idle_cpu(pool->cpu) ? "idle" : "busy",
> + pool->nr_workers, pool->nr_idle);
> + pr_info("The pool might have troubles to wake up another idle worker.\n");
> + if (pool->manager) {
> + pr_info("Backtrace of the pool manager:\n");
> + sched_show_task(pool->manager->task);
> + }
> + trigger_single_cpu_backtrace(pool->cpu);
> + }
> }
>
> static void show_cpu_pools_hogs(void)
>
>
> Warning: The code is not safe. We would need to add some synchronization
> of the pool->manager pointer.
>
> Even better might be to print state and backtrace of the process
> which was woken by kick_pool() when the last running worker
> went asleep.
I agree. We should probably store the last woken worker in the worker_pool
structure and print it later.
I've spent some time verifying that the locking and lifecycle management are
correct. While I'm not completely certain, I believe it's getting closer. An
extra pair of eyes would be helpful.
This is the new version of this patch:
commit feccca7e696ead3272669ee4d4dc02b6946d0faf
Author: Breno Leitao <leitao@debian.org>
Date: Mon Mar 16 09:47:09 2026 -0700
workqueue: print diagnostic info when no worker is in running state
show_cpu_pool_busy_workers() iterates over busy workers but gives no
feedback when none are found in running state, which is a key indicator
that a pool may be stuck — unable to wake an idle worker to process
pending work.
Add a diagnostic message when no running workers are found, reporting
pool id, CPU, idle state, and worker counts. Also trigger a single-CPU
backtrace for the stalled CPU.
To identify the task most likely responsible for the stall, add
last_woken_worker (L: pool->lock) to worker_pool and record it in
kick_pool() just before wake_up_process(). This captures the idle
worker that was kicked to take over when the last running worker went to
sleep; if the pool is now stuck with no running worker, that task is the
prime suspect and its backtrace is dumped.
Using struct worker * rather than struct task_struct * avoids any
lifetime concern: workers are only destroyed via set_worker_dying()
which requires pool->lock, and set_worker_dying() clears
last_woken_worker when the dying worker matches. show_cpu_pool_busy_workers()
holds pool->lock while calling sched_show_task(), so last_woken_worker
is either NULL or points to a live worker with a valid task. More
precisely, set_worker_dying() clears last_woken_worker before setting
WORKER_DIE, so a non-NULL last_woken_worker means the kthread has not
yet exited and worker->task is still alive.
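To illustrate the lifetime rule described above, here is a small userspace
sketch of the same pattern (all names are illustrative analogs, not the
kernel's): the pointer is only written under the lock, and the destroy path
clears it under the same lock before freeing, so a reader holding the lock
sees either NULL or a live object.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical analog of worker_pool/worker; "lock" plays the role
 * of pool->lock and "last_woken" the role of last_woken_worker. */
struct worker { int id; };

struct pool {
	pthread_mutex_t lock;
	struct worker *last_woken;	/* protected by lock */
};

/* Analog of kick_pool(): record the worker before "waking" it. */
static void pool_kick(struct pool *p, struct worker *w)
{
	pthread_mutex_lock(&p->lock);
	p->last_woken = w;
	pthread_mutex_unlock(&p->lock);
}

/* Analog of set_worker_dying(): clear the pointer under the lock
 * before the worker goes away, so it can never dangle. */
static void worker_destroy(struct pool *p, struct worker *w)
{
	pthread_mutex_lock(&p->lock);
	if (p->last_woken == w)
		p->last_woken = NULL;
	pthread_mutex_unlock(&p->lock);
	free(w);
}

/* Analog of the reader in show_pool_no_running_worker(): only
 * valid while holding the lock; returns -1 if nothing recorded. */
static int pool_last_woken_id(struct pool *p)
{
	int id = -1;

	pthread_mutex_lock(&p->lock);
	if (p->last_woken)
		id = p->last_woken->id;
	pthread_mutex_unlock(&p->lock);
	return id;
}
```

The invariant is the same one the commit message relies on: because both the
writer and the destroy path serialize on the lock, the reader never needs a
reference count on the worker itself.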
The pool info message is printed inside pool->lock using
printk_deferred_enter/exit, the same pattern used by the existing
busy-worker loop, to avoid deadlocks with console drivers that queue
work while holding locks also taken in their write paths.
trigger_single_cpu_backtrace() is called after releasing the lock.
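For readers unfamiliar with the deferred-printing pattern, here is a rough
userspace sketch of the idea (names are illustrative; the real kernel hands
deferred messages to irq_work rather than flushing at exit as this
simplification does): while the "pool lock" is held, messages go into a
buffer instead of the console path, which could itself take locks.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified analog of printk_deferred_enter/exit: messages logged
 * inside an enter/exit section are buffered, not emitted directly. */
static char deferred_buf[1024];
static size_t deferred_len;
static int deferred_depth;

static void log_deferred_enter(void)
{
	deferred_depth++;
}

static void log_msg(const char *msg)
{
	if (deferred_depth) {
		/* Buffer instead of touching the "console" under a lock. */
		size_t n = strlen(msg);

		if (deferred_len + n < sizeof(deferred_buf)) {
			memcpy(deferred_buf + deferred_len, msg, n);
			deferred_len += n;
		}
	} else {
		/* Direct "console" path, safe with no lock held. */
		fputs(msg, stdout);
	}
}

static void log_deferred_exit_and_flush(void)
{
	if (--deferred_depth == 0 && deferred_len) {
		fwrite(deferred_buf, 1, deferred_len, stdout);
		deferred_len = 0;
	}
}
```

The kernel version decouples the flush from the exit call, but the safety
argument is the same: no console-driver code runs while pool->lock is held.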
Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b77119d71641a..38aebf4514c03 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -217,6 +217,7 @@ struct worker_pool {
/* L: hash of busy workers */
struct worker *manager; /* L: purely informational */
+ struct worker *last_woken_worker; /* L: last worker woken by kick_pool() */
struct list_head workers; /* A: attached workers */
struct ida worker_ida; /* worker IDs for task name */
@@ -1295,6 +1296,9 @@ static bool kick_pool(struct worker_pool *pool)
}
}
#endif
+ /* Track the last idle worker woken, used for stall diagnostics. */
+ pool->last_woken_worker = worker;
+
wake_up_process(p);
return true;
}
@@ -2902,6 +2906,13 @@ static void set_worker_dying(struct worker *worker, struct list_head *list)
pool->nr_workers--;
pool->nr_idle--;
+ /*
+ * Clear last_woken_worker if it points to this worker, so that
+ * show_cpu_pool_busy_workers() cannot dereference a freed worker.
+ */
+ if (pool->last_woken_worker == worker)
+ pool->last_woken_worker = NULL;
+
worker->flags |= WORKER_DIE;
list_move(&worker->entry, list);
@@ -7582,20 +7593,58 @@ module_param_named(panic_on_stall_time, wq_panic_on_stall_time, uint, 0644);
MODULE_PARM_DESC(panic_on_stall_time, "Panic if stall exceeds this many seconds (0=disabled)");
/*
- * Show workers that might prevent the processing of pending work items.
- * A busy worker that is not running on the CPU (e.g. sleeping in
- * wait_event_idle() with PF_WQ_WORKER cleared) can stall the pool just as
- * effectively as a CPU-bound one, so dump every in-flight worker.
+ * Report that a pool has no worker in running state, which is a sign that the
+ * pool may be stuck. Print pool info. Must be called with pool->lock held and
+ * inside a printk_deferred_enter/exit region.
+ */
+static void show_pool_no_running_worker(struct worker_pool *pool)
+{
+ lockdep_assert_held(&pool->lock);
+
+ printk_deferred_enter();
+ pr_info("pool %d: no worker in running state, cpu=%d is %s (nr_workers=%d nr_idle=%d)\n",
+ pool->id, pool->cpu,
+ idle_cpu(pool->cpu) ? "idle" : "busy",
+ pool->nr_workers, pool->nr_idle);
+ pr_info("The pool might have trouble waking an idle worker.\n");
+ /*
+ * last_woken_worker and its task are valid here: set_worker_dying()
+ * clears it under pool->lock before setting WORKER_DIE, so if
+ * last_woken_worker is non-NULL the kthread has not yet exited and
+ * worker->task is still alive.
+ */
+ if (pool->last_woken_worker) {
+ pr_info("Backtrace of last woken worker:\n");
+ sched_show_task(pool->last_woken_worker->task);
+ } else {
+ pr_info("No last woken worker recorded\n");
+ }
+ printk_deferred_exit();
+}
+
+/*
+ * Show running workers that might prevent the processing of pending work items.
+ * If no running worker is found, the pool may be stuck waiting for an idle
+ * worker to be woken, so report the pool state and the last woken worker.
*/
static void show_cpu_pool_busy_workers(struct worker_pool *pool)
{
struct worker *worker;
unsigned long irq_flags;
- int bkt;
+ bool found_running = false;
+ int cpu, bkt;
raw_spin_lock_irqsave(&pool->lock, irq_flags);
+ /* Snapshot cpu inside the lock to safely use it after unlock. */
+ cpu = pool->cpu;
+
hash_for_each(pool->busy_hash, bkt, worker, hentry) {
+ /* Skip workers that are not actively running on the CPU. */
+ if (!task_is_running(worker->task))
+ continue;
+
+ found_running = true;
/*
* Defer printing to avoid deadlocks in console
* drivers that queue work while holding locks
@@ -7609,7 +7658,23 @@ static void show_cpu_pool_busy_workers(struct worker_pool *pool)
printk_deferred_exit();
}
+ /*
+ * If no running worker was found, the pool is likely stuck. Print pool
+ * state and the backtrace of the last woken worker, which is the prime
+ * suspect for the stall.
+ */
+ if (!found_running)
+ show_pool_no_running_worker(pool);
+
raw_spin_unlock_irqrestore(&pool->lock, irq_flags);
+
+ /*
+ * Trigger a backtrace on the stalled CPU to capture what it is
+ * currently executing. Called after releasing the lock to avoid
+ * any potential issues with NMI delivery.
+ */
+ if (!found_running)
+ trigger_single_cpu_backtrace(cpu);
}
static void show_cpu_pools_busy_workers(void)
Thread overview: 24+ messages
2026-03-05 16:15 [PATCH v2 0/5] workqueue: Detect stalled in-flight workers Breno Leitao
2026-03-05 16:15 ` [PATCH v2 1/5] workqueue: Use POOL_BH instead of WQ_BH when checking pool flags Breno Leitao
2026-03-05 17:13 ` Song Liu
2026-03-05 16:15 ` [PATCH v2 2/5] workqueue: Rename pool->watchdog_ts to pool->last_progress_ts Breno Leitao
2026-03-05 17:16 ` Song Liu
2026-03-05 16:15 ` [PATCH v2 3/5] workqueue: Show in-flight work item duration in stall diagnostics Breno Leitao
2026-03-05 17:17 ` Song Liu
2026-03-05 16:15 ` [PATCH v2 4/5] workqueue: Show all busy workers " Breno Leitao
2026-03-05 17:17 ` Song Liu
2026-03-12 17:03 ` Petr Mladek
2026-03-13 12:57 ` Breno Leitao
2026-03-13 16:27 ` Petr Mladek
2026-03-18 11:31 ` Breno Leitao [this message]
2026-03-18 15:11 ` Petr Mladek
2026-03-20 10:41 ` Breno Leitao
2026-03-05 16:15 ` [PATCH v2 5/5] workqueue: Add stall detector sample module Breno Leitao
2026-03-05 17:25 ` Song Liu
2026-03-05 17:39 ` [PATCH v2 0/5] workqueue: Improve stall diagnostics Tejun Heo
2026-03-12 16:38 ` [PATCH v2 0/5] workqueue: Detect stalled in-flight workers Petr Mladek
2026-03-13 12:24 ` Breno Leitao
2026-03-13 14:38 ` Petr Mladek
2026-03-13 17:36 ` Breno Leitao
2026-03-18 16:46 ` Petr Mladek
2026-03-20 10:44 ` Breno Leitao