From: Breno Leitao <leitao@debian.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org,
Omar Sandoval <osandov@osandov.com>, Song Liu <song@kernel.org>,
Danielle Costantino <dcostantino@meta.com>,
kasan-dev@googlegroups.com, kernel-team@meta.com
Subject: Re: [PATCH v2 4/5] workqueue: Show all busy workers in stall diagnostics
Date: Wed, 18 Mar 2026 04:31:08 -0700 [thread overview]
Message-ID: <abqLMy-ktHXRRQ1C@gmail.com> (raw)
In-Reply-To: <abQ6_FsxtfH8nXka@pathway>
Hello Petr,
On Fri, Mar 13, 2026 at 05:27:40PM +0100, Petr Mladek wrote:
> I took inspiration from your patch. This is what comes to my mind
> on top of the current master (printing only running workers):
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index aeaec79bc09c..a044c7e42139 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -7588,12 +7588,15 @@ static void show_cpu_pool_hog(struct worker_pool *pool)
> {
> struct worker *worker;
> unsigned long irq_flags;
> + bool found_running;
> int bkt;
>
> raw_spin_lock_irqsave(&pool->lock, irq_flags);
>
> + found_running = false;
> hash_for_each(pool->busy_hash, bkt, worker, hentry) {
> if (task_is_running(worker->task)) {
> + found_running = true;
> /*
> * Defer printing to avoid deadlocks in console
> * drivers that queue work while holding locks
> @@ -7609,6 +7612,19 @@ static void show_cpu_pool_hog(struct worker_pool *pool)
> }
>
> raw_spin_unlock_irqrestore(&pool->lock, irq_flags);
> +
> + if (!found_running) {
> + pr_info("pool %d: no worker in running state, cpu=%d is %s (nr_workers=%d nr_idle=%d)\n",
> + pool->id, pool->cpu,
> + idle_cpu(pool->cpu) ? "idle" : "busy",
> + pool->nr_workers, pool->nr_idle);
> + pr_info("The pool might have troubles to wake up another idle worker.\n");
> + if (pool->manager) {
> + pr_info("Backtrace of the pool manager:\n");
> + sched_show_task(pool->manager->task);
> + }
> + trigger_single_cpu_backtrace(pool->cpu);
> + }
> }
>
> static void show_cpu_pools_hogs(void)
>
>
> Warning: The code is not safe. We would need add some synchronization
> of the pool->manager pointer.
>
> Even better might be to print state and backtrace of the process
> which was woken by kick_pool() when the last running worker
> went asleep.
I agree. We should probably store the last woken worker in the worker_pool
structure and print it later.
I've spent some time verifying that the locking and lifecycle management are
correct. While I'm not completely certain, I believe it's getting closer. An
extra pair of eyes would be helpful.
This is the new version of this patch:
commit feccca7e696ead3272669ee4d4dc02b6946d0faf
Author: Breno Leitao <leitao@debian.org>
Date: Mon Mar 16 09:47:09 2026 -0700
workqueue: print diagnostic info when no worker is in running state
show_cpu_pool_busy_workers() iterates over busy workers but gives no
feedback when none are found in running state, which is a key indicator
that a pool may be stuck — unable to wake an idle worker to process
pending work.
Add a diagnostic message when no running workers are found, reporting
pool id, CPU, idle state, and worker counts. Also trigger a single-CPU
backtrace for the stalled CPU.
To identify the task most likely responsible for the stall, add
last_woken_worker (L: pool->lock) to worker_pool and record it in
kick_pool() just before wake_up_process(). This captures the idle
worker that was kicked to take over when the last running worker went to
sleep; if the pool is now stuck with no running worker, that task is the
prime suspect and its backtrace is dumped.
Using struct worker * rather than struct task_struct * avoids any
lifetime concern: workers are only destroyed via set_worker_dying()
which requires pool->lock, and set_worker_dying() clears
last_woken_worker when the dying worker matches. show_cpu_pool_busy_workers()
holds pool->lock while calling sched_show_task(), so last_woken_worker
is either NULL or points to a live worker with a valid task. More
precisely, set_worker_dying() clears last_woken_worker before setting
WORKER_DIE, so a non-NULL last_woken_worker means the kthread has not
yet exited and worker->task is still alive.
The pool info message is printed inside pool->lock using
printk_deferred_enter/exit, the same pattern used by the existing
busy-worker loop, to avoid deadlocks with console drivers that queue
work while holding locks also taken in their write paths.
trigger_single_cpu_backtrace() is called after releasing the lock.
Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b77119d71641a..38aebf4514c03 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -217,6 +217,7 @@ struct worker_pool {
/* L: hash of busy workers */
struct worker *manager; /* L: purely informational */
+ struct worker *last_woken_worker; /* L: last worker woken by kick_pool() */
struct list_head workers; /* A: attached workers */
struct ida worker_ida; /* worker IDs for task name */
@@ -1295,6 +1296,9 @@ static bool kick_pool(struct worker_pool *pool)
}
}
#endif
+ /* Track the last idle worker woken, used for stall diagnostics. */
+ pool->last_woken_worker = worker;
+
wake_up_process(p);
return true;
}
@@ -2902,6 +2906,13 @@ static void set_worker_dying(struct worker *worker, struct list_head *list)
pool->nr_workers--;
pool->nr_idle--;
+ /*
+ * Clear last_woken_worker if it points to this worker, so that
+ * show_cpu_pool_busy_workers() cannot dereference a freed worker.
+ */
+ if (pool->last_woken_worker == worker)
+ pool->last_woken_worker = NULL;
+
worker->flags |= WORKER_DIE;
list_move(&worker->entry, list);
@@ -7582,20 +7593,58 @@ module_param_named(panic_on_stall_time, wq_panic_on_stall_time, uint, 0644);
MODULE_PARM_DESC(panic_on_stall_time, "Panic if stall exceeds this many seconds (0=disabled)");
/*
- * Show workers that might prevent the processing of pending work items.
- * A busy worker that is not running on the CPU (e.g. sleeping in
- * wait_event_idle() with PF_WQ_WORKER cleared) can stall the pool just as
- * effectively as a CPU-bound one, so dump every in-flight worker.
+ * Report that a pool has no worker in running state, which is a sign that the
+ * pool may be stuck. Print pool info. Must be called with pool->lock held and
+ * inside a printk_deferred_enter/exit region.
+ */
+static void show_pool_no_running_worker(struct worker_pool *pool)
+{
+ lockdep_assert_held(&pool->lock);
+
+ printk_deferred_enter();
+ pr_info("pool %d: no worker in running state, cpu=%d is %s (nr_workers=%d nr_idle=%d)\n",
+ pool->id, pool->cpu,
+ idle_cpu(pool->cpu) ? "idle" : "busy",
+ pool->nr_workers, pool->nr_idle);
+ pr_info("The pool might have trouble waking an idle worker.\n");
+ /*
+ * last_woken_worker and its task are valid here: set_worker_dying()
+ * clears it under pool->lock before setting WORKER_DIE, so if
+ * last_woken_worker is non-NULL the kthread has not yet exited and
+ * worker->task is still alive.
+ */
+ if (pool->last_woken_worker) {
+ pr_info("Backtrace of last woken worker:\n");
+ sched_show_task(pool->last_woken_worker->task);
+ } else {
+ pr_info("Last woken worker empty\n");
+ }
+ printk_deferred_exit();
+}
+
+/*
+ * Show running workers that might prevent the processing of pending work items.
+ * If no running worker is found, the pool may be stuck waiting for an idle
+ * worker to be woken, so report the pool state and the last woken worker.
*/
static void show_cpu_pool_busy_workers(struct worker_pool *pool)
{
struct worker *worker;
unsigned long irq_flags;
- int bkt;
+ bool found_running = false;
+ int cpu, bkt;
raw_spin_lock_irqsave(&pool->lock, irq_flags);
+ /* Snapshot cpu inside the lock to safely use it after unlock. */
+ cpu = pool->cpu;
+
hash_for_each(pool->busy_hash, bkt, worker, hentry) {
+ /* Skip workers that are not actively running on the CPU. */
+ if (!task_is_running(worker->task))
+ continue;
+
+ found_running = true;
/*
* Defer printing to avoid deadlocks in console
* drivers that queue work while holding locks
@@ -7609,7 +7658,23 @@ static void show_cpu_pool_busy_workers(struct worker_pool *pool)
printk_deferred_exit();
}
+ /*
+ * If no running worker was found, the pool is likely stuck. Print pool
+ * state and the backtrace of the last woken worker, which is the prime
+ * suspect for the stall.
+ */
+ if (!found_running)
+ show_pool_no_running_worker(pool);
+
raw_spin_unlock_irqrestore(&pool->lock, irq_flags);
+
+ /*
+ * Trigger a backtrace on the stalled CPU to capture what it is
+ * currently executing. Called after releasing the lock to avoid
+ * any potential issues with NMI delivery.
+ */
+ if (!found_running)
+ trigger_single_cpu_backtrace(cpu);
}
static void show_cpu_pools_busy_workers(void)
next prev parent reply other threads:[~2026-03-18 11:31 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-05 16:15 [PATCH v2 0/5] workqueue: Detect stalled in-flight workers Breno Leitao
2026-03-05 16:15 ` [PATCH v2 1/5] workqueue: Use POOL_BH instead of WQ_BH when checking pool flags Breno Leitao
2026-03-05 17:13 ` Song Liu
2026-03-05 16:15 ` [PATCH v2 2/5] workqueue: Rename pool->watchdog_ts to pool->last_progress_ts Breno Leitao
2026-03-05 17:16 ` Song Liu
2026-03-05 16:15 ` [PATCH v2 3/5] workqueue: Show in-flight work item duration in stall diagnostics Breno Leitao
2026-03-05 17:17 ` Song Liu
2026-03-05 16:15 ` [PATCH v2 4/5] workqueue: Show all busy workers " Breno Leitao
2026-03-05 17:17 ` Song Liu
2026-03-12 17:03 ` Petr Mladek
2026-03-13 12:57 ` Breno Leitao
2026-03-13 16:27 ` Petr Mladek
2026-03-18 11:31 ` Breno Leitao [this message]
2026-03-18 15:11 ` Petr Mladek
2026-03-20 10:41 ` Breno Leitao
2026-05-07 10:20 ` Jiri Slaby
2026-05-07 13:11 ` Breno Leitao
2026-05-11 5:21 ` Jiri Slaby
2026-05-13 7:29 ` Thorsten Leemhuis
2026-05-13 8:03 ` Jiri Slaby
2026-03-05 16:15 ` [PATCH v2 5/5] workqueue: Add stall detector sample module Breno Leitao
2026-03-05 17:25 ` Song Liu
2026-03-05 17:39 ` [PATCH v2 0/5] workqueue: Improve stall diagnostics Tejun Heo
2026-03-12 16:38 ` [PATCH v2 0/5] workqueue: Detect stalled in-flight workers Petr Mladek
2026-03-13 12:24 ` Breno Leitao
2026-03-13 14:38 ` Petr Mladek
2026-03-13 17:36 ` Breno Leitao
2026-03-18 16:46 ` Petr Mladek
2026-03-20 10:44 ` Breno Leitao
2026-05-13 8:57 ` Hillf Danton
2026-05-13 8:53 ` Markus Elfring
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=abqLMy-ktHXRRQ1C@gmail.com \
--to=leitao@debian.org \
--cc=akpm@linux-foundation.org \
--cc=dcostantino@meta.com \
--cc=jiangshanlai@gmail.com \
--cc=kasan-dev@googlegroups.com \
--cc=kernel-team@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=osandov@osandov.com \
--cc=pmladek@suse.com \
--cc=song@kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.