From: Breno Leitao <leitao@debian.org>
To: Petr Mladek <pmladek@suse.com>
Cc: Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org,
Omar Sandoval <osandov@osandov.com>, Song Liu <song@kernel.org>,
Danielle Costantino <dcostantino@meta.com>,
kasan-dev@googlegroups.com, kernel-team@meta.com
Subject: Re: [PATCH v2 4/5] workqueue: Show all busy workers in stall diagnostics
Date: Thu, 11 Jun 2026 07:50:04 -0700
Message-ID: <airKG4Hl8R-7sY_x@gmail.com>
In-Reply-To: <ab0kDS01bh5cK4KG@gmail.com>
On Fri, Mar 20, 2026 at 03:41:13AM -0700, Breno Leitao wrote:
> On Wed, Mar 18, 2026 at 04:11:54PM +0100, Petr Mladek wrote:
> > On Wed 2026-03-18 04:31:08, Breno Leitao wrote:
> > Otherwise, I like this patch.
> >
> > I still think what might be the reason that there is no worker
> > in the running state. Let's see if this patch brings some useful info.
> >
> > One more idea. It might be useful to store a timestamp when the last
> > worker was woken. And then print either the timestamp or delta.
> > It would help to make sure that kick_pool() was really called
> > during the reported stall.
>
> Ack, this is the following patch I will deploy in production, let's see
> how useful it is.
I got this running in production (backported to 6.16), and we finally caught the culprit:
05:42:00 BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 115s!
NMI backtrace for cpu 2
CPU: 2 UID: 0 PID: 411 Comm: kworker/u288:2 Tainted: G O 6.16.1-0_fbk4_0_gb849430a436c #1 NONE
Tainted: [O]=OOT_MODULE
Hardware name: <foo>
Workqueue: efi_rts_wq efi_call_rts
pstate: 23401009 (nzCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--)
pc : 0x4052f10900
lr : 0x4052f10e94
sp : ffff800088cefc90
x29: ffff800088cefc90 x28: 0000000048524641 x27: 0000004052b60000
x26: 0000000000010058 x25: 0000004043ba0000 x24: 0000000001280000
x23: 000000405a02807f x22: 0000000000010080 x21: 0000004053ac0097
x20: 000000405a028080 x19: 0000004053ac0098 x18: 0000000000000000
x17: 0000000000000030 x16: 0000004052eb6de0 x15: 0000004042ba0030
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000001
x11: 0000000001d00d09 x10: 0000004042ba0028 x9 : ffff800088cefc90
x8 : 0000000001d00cd9 x7 : 0000000000000000 x6 : 0000004043ba0000
x5 : 0000004043bb0000 x4 : 0000004053ac0098 x3 : 000000405a028080
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffffffffffe1e8
Call trace:
0x4052f10900 (P)
0x4052f10e94
0x4052b00ed0
0x4052b02e38
0x4052b0175c
0x4052b517b4
0x4052a70b84
0x4052cb11d4
__efi_rt_asm_wrapper+0x50/0x78
efi_call_rts+0x178/0x240
process_scheduled_works+0x17c/0x420
worker_thread+0x184/0x4d8
kthread+0xcc/0x1f8
ret_from_fork+0x10/0x20
05:42:30 BUG: workqueue lockup - pool cpus=2 node=0 flags=0x0 nice=0 stuck for 145s!
NMI backtrace for cpu 2
CPU: 2 UID: 0 PID: 411 Comm: kworker/u288:2 Tainted: G O 6.16.1-0_fbk4_0_gb849430a436c #1 NONE
Tainted: [O]=OOT_MODULE
Hardware name: <foo>
Workqueue: efi_rts_wq efi_call_rts
pstate: 63401009 (nZCv daif +PAN -UAO +TCO +DIT +SSBS BTYPE=--)
pc : 0x4052f11ecc
lr : 0x4052f10b8c
sp : ffff800088cefc30
x29: ffff800088cefc40 x28: 0000000048524641 x27: 0000004052b60000
x26: 0000000000010058 x25: 0000004043fb0000 x24: 0000000001690000
x23: 0000004053ab0040 x22: 0000000000010080 x21: ffff800088cefd00
Rinse and repeat...
Unfortunately, I didn't get the other pr_info() output because of the console
settings, but I can say the following from this issue and the previous code:
1) In show_cpu_pool_hog(), the found_running variable is set to false.
2) hash_for_each() never found any running task.
3) The following code was triggered and was very helpful:
	if (!found_running)
		trigger_single_cpu_backtrace(cpu);
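
For readers following along without the patch at hand, the three points above can be modelled with a small userspace sketch. This is not the kernel code: struct sketch_worker, sketch_show_cpu_pool_hog() and sketch_trigger_cpu_backtrace() are hypothetical stand-ins, a plain array replaces the pool's busy hash walked by hash_for_each(), and the "running" flag is only an assumption about how a running task is detected. It only illustrates the fallback: if no busy worker is found running, ask for a backtrace of the CPU itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a busy worker; not the kernel's struct worker. */
struct sketch_worker {
	int pid;
	bool running;	/* models "the worker's task is in the running state" */
};

/* Counts how often the fallback fired, so the behaviour can be checked. */
static int sketch_backtraces_triggered;

/* Stand-in for trigger_single_cpu_backtrace(); it only records the request. */
static void sketch_trigger_cpu_backtrace(int cpu)
{
	sketch_backtraces_triggered++;
	printf("requesting backtrace for cpu %d\n", cpu);
}

/*
 * Walk the busy workers of a stalled pool. Returns true if at least one
 * worker was running; otherwise falls back to a backtrace of the CPU.
 */
static bool sketch_show_cpu_pool_hog(const struct sketch_worker *busy,
				     size_t nr_busy, int cpu)
{
	bool found_running = false;
	size_t i;

	for (i = 0; i < nr_busy; i++) {
		if (!busy[i].running)
			continue;
		found_running = true;
		printf("pool hog candidate: pid %d\n", busy[i].pid);
	}

	/* Nothing in the pool looks guilty, so look at the CPU itself. */
	if (!found_running)
		sketch_trigger_cpu_backtrace(cpu);

	return found_running;
}
```

Calling sketch_show_cpu_pool_hog() with an array in which no worker is marked running takes the fallback path, which corresponds to the case that produced the NMI backtraces above.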