From: Hillf Danton <hdanton@sina.com>
To: Breno Leitao <leitao@debian.org>
Cc: Petr Mladek <pmladek@suse.com>, Tejun Heo <tj@kernel.org>,
linux-kernel@vger.kernel.org, Omar Sandoval <osandov@osandov.com>,
Danielle Costantino <dcostantino@meta.com>,
kasan-dev@googlegroups.com
Subject: Re: [PATCH v2 0/5] workqueue: Detect stalled in-flight workers
Date: Wed, 13 May 2026 16:57:24 +0800
Message-ID: <20260513085725.597-1-hdanton@sina.com>
In-Reply-To: <abP8wDhYWwk3ufmA@gmail.com>

On Fri, 13 Mar 2026 05:24:54 -0700 Breno Leitao wrote:
> On Thu, Mar 12, 2026 at 05:38:26PM +0100, Petr Mladek wrote:
> > On Thu 2026-03-05 08:15:36, Breno Leitao wrote:
> > > A blind spot exists in the workqueue stall detector (aka
> > > show_cpu_pool_hog()). It only prints workers whose task_is_running() is
> > > true, so a busy worker that is sleeping (e.g. wait_event_idle())
> > > produces an empty backtrace section even though it is the cause of the
> > > stall.
> > >
> > > Additionally, when the watchdog does report stalled pools, the output
> > > doesn't show how long each in-flight work item has been running, making
> > > it harder to identify which specific worker is stuck.
> > >
> > > Example of the watchdog output:
> > >
> > > BUG: workqueue lockup - pool cpus=4 node=0 flags=0x0 nice=0 stuck for 132s!
> > > Showing busy workqueues and worker pools:
> > > workqueue events: flags=0x100
> > >   pwq 18: cpus=4 node=0 flags=0x0 nice=0 active=4 refcnt=5
> > >     in-flight: 178:stall_work1_fn [wq_stall]
> > >     pending: stall_work2_fn [wq_stall], free_obj_work, psi_avgs_work
> > > ...
> > > Showing backtraces of running workers in stalled
> > > CPU-bound worker pools:
> > > <nothing here>
> > >
> > > I see it happening on real machines, causing some stalls that don't
> > > have any backtrace. This is one of the code paths:
> > >
> > > 1) kfence executes toggle_allocation_gate() as a delayed workqueue
> > > item (kfence_timer) on the system WQ.
> > >
> > > 2) toggle_allocation_gate() enables a static key, which IPIs every
> > > CPU to patch code:
> > > static_branch_enable(&kfence_allocation_key);
> > >
> > > 3) toggle_allocation_gate() then sleeps in TASK_IDLE waiting for a
> > > kfence allocation to occur:
> > >      wait_event_idle(allocation_wait,
> > >              atomic_read(&kfence_allocation_gate) > 0 || ...);
> > >
> > > This can last indefinitely if no allocation goes through the
> > > kfence path (or IPIing all the CPUs takes longer, which is common on
> > > platforms that do not have NMI).
> > >
> > > The worker remains in the pool's busy_hash
> > > (in-flight) but is no longer task_is_running().
> > >
> > > 4) The workqueue watchdog detects the stall and calls
> > > show_cpu_pool_hog(), which only prints backtraces for workers
> > > that are actively running on CPU:
> > >
> > >      static void show_cpu_pool_hog(struct worker_pool *pool) {
> > >              ...
> > >              if (task_is_running(worker->task))
> > >                      sched_show_task(worker->task);
> > >      }
> > >
> > > 5) Nothing is printed because the offending worker is in TASK_IDLE
> > > state. The output shows "Showing backtraces of running workers in
> > > stalled CPU-bound worker pools:" followed by nothing, effectively
> > > hiding the actual culprit.
> >
> > I am trying to better understand the situation. There was a reason
> > why only the worker in the running state was shown.
> >
> > Normally, a sleeping worker should not cause a stall. The scheduler calls
> > wq_worker_sleeping() which should wake up another idle worker. There is
> > always at least one idle worker in the pool. It should start processing
> > the next pending work. Or it should fork another worker when it was
> > the last idle one.
>
> Right, but let's look at this case:
>
> BUG: workqueue lockup - pool 55 cpu 13 curr 0 (swapper/13) stack ffff800085640000 cpus=13 node=0 flags=0x0 nice=-20 stuck for 679s!
> work func=blk_mq_timeout_work data=0xffff0000ad7e3a05
> Showing busy workqueues and worker pools:
> workqueue events_unbound: flags=0x2
>   pwq 288: cpus=0-71 flags=0x4 nice=0 active=1 refcnt=2
>     in-flight: 4083734:btrfs_extent_map_shrinker_worker
> workqueue mm_percpu_wq: flags=0x8
>   pwq 14: cpus=3 node=0 flags=0x0 nice=0 active=1 refcnt=2
>     pending: vmstat_update
> pool 288: cpus=0-71 flags=0x4 nice=0 hung=0s workers=17 idle: 3800629 3959700 3554824 3706405 3759881 4065549 4041361 4065548 1715676 4086805 3860852 3587585 4065550 4014041 3944711 3744484
> Showing backtraces of running workers in stalled CPU-bound worker pools:
> # Nothing in here
>
> It seems CPU 13 is idle (curr = 0) and blk_mq_timeout_work has been pending for
> 679s ?
>
An idle CPU failed to process pending work, so the root cause lies outside
the workqueue code, and it is difficult to understand why giving Peter more
X-ray scans helps if Paul has a bone in his throat.
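
For reference, the reporting blind spot described in the quoted cover
letter can be modelled in plain userspace C. This is only a sketch under
stated assumptions: struct model_worker, the MODEL_* states and both
helper functions are hypothetical stand-ins invented for illustration,
not kernel code and not the code from the patch series. It shows why a
"running only" filter prints nothing when the single in-flight worker is
sleeping, while a filter over all busy workers still names it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a worker's scheduler state (not kernel types). */
enum model_state {
	MODEL_RUNNING,		/* stands in for task_is_running() == true */
	MODEL_SLEEPING_IDLE,	/* stands in for a TASK_IDLE sleeper */
};

/* Hypothetical model of one in-flight (busy) worker. */
struct model_worker {
	int pid;
	const char *fn;		/* name of the work function */
	enum model_state state;
};

/*
 * Models the filter quoted in the thread: only running workers are
 * reported. Returns how many workers were printed.
 */
static size_t show_running_only(const struct model_worker *w, size_t n,
				FILE *out)
{
	size_t shown = 0;

	assert(w != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (w[i].state != MODEL_RUNNING)
			continue;	/* sleeping workers are skipped */
		fprintf(out, "worker %d: %s\n", w[i].pid, w[i].fn);
		shown++;
	}
	return shown;
}

/*
 * Models the alternative: every busy worker is reported, whatever its
 * state. Returns how many workers were printed.
 */
static size_t show_all_busy(const struct model_worker *w, size_t n,
			    FILE *out)
{
	assert(w != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		const char *st = w[i].state == MODEL_RUNNING ?
				 "running" : "sleeping";

		fprintf(out, "worker %d: %s (%s)\n", w[i].pid, w[i].fn, st);
	}
	return n;
}
```

Neither variant in this model would say anything about a report with no
in-flight worker at all on the stalled pool, such as the blk_mq_timeout_work
case quoted above, where the work is pending and the CPU is idle.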