All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC 0/3] workqueue: improve stall diagnostics for pools with no running worker
@ 2026-06-16 16:44 Breno Leitao
  2026-06-16 16:44 ` [PATCH RFC 1/3] workqueue: only show running workers in stall diagnostics Breno Leitao
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Breno Leitao @ 2026-06-16 16:44 UTC (permalink / raw)
  To: Tejun Heo, Lai Jiangshan, Song Liu
  Cc: linux-kernel, pmladek, Breno Leitao, kernel-team

The workqueue watchdog fires when a pool stops making progress, but the
diagnostics it printed did not always point at the culprit.

Commit 8823eaef45da7 ("workqueue: Show all busy workers in stall
diagnostics") made the watchdog dump every in-flight worker in the
stalled pool's busy_hash, including workers that are not running on the
CPU.  As Petr Mladek pointed out, that is rarely useful: a worker that is
merely sleeping inside a work item does not, by itself, hold up the pool
-- the scheduler calls wq_worker_sleeping() and the pool wakes (or forks)
another worker to keep going.  Dumping all of those sleeping workers
mostly adds noise and can do more harm than good.

The condition actually worth reporting is the opposite one: a pool that
is stalled with *no* running worker at all.  That means the pool failed
to get a worker onto the CPU -- it could not wake an idle worker, could
not fork a new one, or the CPU is busy running something that is not a
workqueue worker.  In that case the previous code printed an empty
backtrace section and gave no hint about what went wrong.

Following Petr's suggestion, this series reworks the diagnostics:

  1) workqueue: only show running workers in stall diagnostics
     Restore the task_is_running() filter (reverting the behaviour of
     8823eaef45da7) so only running workers are dumped, and explicitly
     report when a stalled pool has no running worker, printing the pool
     id, CPU, idle state and worker counts.

  2) workqueue: trigger a single-CPU backtrace for stalled pools
     Trigger a backtrace of the stalled CPU so whatever task is actually
     occupying it -- workqueue worker or not -- is captured.

  3) workqueue: dump the last woken worker for stalled pools
     Record the worker last woken by kick_pool() and dump its backtrace.
     It is the prime suspect: the idle worker kicked to take over when
     the previous running worker went to sleep.

The reworked diagnostics have been running on the Meta fleet (backported
to 6.16) and finally explained a long-standing arm64 stall that the old
output could not: an EFI runtime call wedging an efi_rts_wq worker on a
machine without NMI.  The single-CPU backtrace from patch 2 pinpointed
efi_call_rts() as the stuck task [1].

Example output, reproduced with the in-tree stall detector sample
(samples/workqueue/stall_detector/wq_stall.c); the lines added by this
series are marked "<-- new":

  BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 30s!
  Showing busy workqueues and worker pools:
  workqueue events: flags=0x100
    pwq 10: cpus=2 node=0 flags=0x0 nice=0 active=5 refcnt=6
      in-flight: 58:stall_work1_fn [wq_stall] for 30s
      pending: stall_work2_fn [wq_stall], ...
  pool 10: cpus=2 node=0 flags=0x0 nice=0 hung=30s workers=2 idle: 33
  Showing backtraces of busy workers in stalled worker pools:
  pool 10: no worker in running state, cpu=2 is idle (nr_workers=2 nr_idle=1)   <-- new
  The pool might have trouble waking an idle worker.                            <-- new
  Backtrace of last woken worker:                                              <-- new
  task:kworker/2:1     state:I  pid:58                                          <-- new
   __schedule+0x8fd/0xfc0                                                       <-- new
   stall_work1_fn+0xb2/0x100 [wq_stall]                                         <-- new
   process_scheduled_works+0x254/0x4e0                                          <-- new
   worker_thread+0x222/0x340                                                    <-- new
  Sending NMI from CPU 0 to CPUs 2:                                             <-- new
  NMI backtrace for cpu 2  (idle: default_idle / do_idle)                       <-- new

[1] https://lore.kernel.org/all/20260616-efi_timeout-v3-0-76dd1d26657b@debian.org/

Cc; marco.crivellari@suse.com

Signed-off-by: Breno Leitao <leitao@debian.org>
---
Breno Leitao (3):
      workqueue: only show running workers in stall diagnostics
      workqueue: trigger a single-CPU backtrace for stalled pools
      workqueue: dump the last woken worker for stalled pools

 kernel/workqueue.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 70 insertions(+), 5 deletions(-)
---
base-commit: 4fa3f5fabb30bf00d7475d5a33459ea83d639bf9
change-id: 20260616-wq_dump_petr-7fcf43940204

Best regards,
-- 
Breno Leitao <leitao@debian.org>


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-06-19 15:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-16 16:44 [PATCH RFC 0/3] workqueue: improve stall diagnostics for pools with no running worker Breno Leitao
2026-06-16 16:44 ` [PATCH RFC 1/3] workqueue: only show running workers in stall diagnostics Breno Leitao
2026-06-19 12:58   ` Petr Mladek
2026-06-16 16:44 ` [PATCH RFC 2/3] workqueue: trigger a single-CPU backtrace for stalled pools Breno Leitao
2026-06-19 13:42   ` Petr Mladek
2026-06-16 16:44 ` [PATCH RFC 3/3] workqueue: dump the last woken worker " Breno Leitao
2026-06-19 15:40   ` Petr Mladek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.