* Re: [PATCH v2 4/5] workqueue: Show all busy workers in stall diagnostics
From: Jiri Slaby @ 2026-05-07 10:20 UTC
To: Breno Leitao, Tejun Heo, Lai Jiangshan, Andrew Morton
Cc: linux-kernel, Omar Sandoval, Song Liu, Danielle Costantino, kasan-dev,
    Petr Mladek, kernel-team

On 05. 03. 26, 17:15, Breno Leitao wrote:
> show_cpu_pool_hog() only prints workers whose task is currently running
> on the CPU (task_is_running()). This misses workers that are busy
> processing a work item but are sleeping or blocked — for example, a
> worker that clears PF_WQ_WORKER and enters wait_event_idle(). Such a
> worker still occupies a pool slot and prevents progress, yet produces
> an empty backtrace section in the watchdog output.
>
> This is happening on real arm64 systems, where
> toggle_allocation_gate() IPIs every single CPU in the machine (which
> lacks NMI), causing workqueue stalls that show empty backtraces because
> toggle_allocation_gate() is sleeping in wait_event_idle().
>
> Remove the task_is_running() filter so every in-flight worker in the
> pool's busy_hash is dumped. The busy_hash is protected by pool->lock,
> which is already held.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  kernel/workqueue.c | 28 +++++++++++++---------------
>  1 file changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 56d8af13843f8..09b9ad78d566c 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -7583,9 +7583,9 @@ MODULE_PARM_DESC(panic_on_stall_time, "Panic if stall exceeds this many seconds
>
>  /*
>   * Show workers that might prevent the processing of pending work items.
> - * The only candidates are CPU-bound workers in the running state.
> - * Pending work items should be handled by another idle worker
> - * in all other situations.
> + * A busy worker that is not running on the CPU (e.g. sleeping in
> + * wait_event_idle() with PF_WQ_WORKER cleared) can stall the pool just as
> + * effectively as a CPU-bound one, so dump every in-flight worker.
>   */
>  static void show_cpu_pool_hog(struct worker_pool *pool)
>  {
> @@ -7596,19 +7596,17 @@ static void show_cpu_pool_hog(struct worker_pool *pool)
>  	raw_spin_lock_irqsave(&pool->lock, irq_flags);
>
>  	hash_for_each(pool->busy_hash, bkt, worker, hentry) {
> -		if (task_is_running(worker->task)) {

We see dumps from non-existent CPUs on 7.0 like:

  BUG: workqueue lockup - pool cpus=144 node=0 flags=0x4 nice=0 stuck for 168224s!
  ...
  Showing busy workqueues and worker pools:
  workqueue rcu_gp: flags=0x108
   pwq 578: cpus=144 node=0 flags=0x4 nice=0 active=3 refcnt=4

in:
https://bugzilla.suse.com/show_bug.cgi?id=1263947

Can this (or another patch from the series) cause this? Should there be
something like cpu_online() instead of task_is_running() somewhere?

> -			/*
> -			 * Defer printing to avoid deadlocks in console
> -			 * drivers that queue work while holding locks
> -			 * also taken in their write paths.
> -			 */
> -			printk_deferred_enter();
> +		/*
> +		 * Defer printing to avoid deadlocks in console
> +		 * drivers that queue work while holding locks
> +		 * also taken in their write paths.
> +		 */
> +		printk_deferred_enter();
>
> -			pr_info("pool %d:\n", pool->id);
> -			sched_show_task(worker->task);
> +		pr_info("pool %d:\n", pool->id);
> +		sched_show_task(worker->task);
>
> -			printk_deferred_exit();
> -		}
> +		printk_deferred_exit();
>  	}
>
>  	raw_spin_unlock_irqrestore(&pool->lock, irq_flags);

thanks,
-- 
js
suse labs
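A minimal sketch of the cpu_online() placement Jiri is asking about, for
illustration only: the guard below, its position, and the decision to
skip per-CPU pools bound to offline CPUs are all assumptions, not a
proposed patch (and, as the rest of the thread shows, the actual culprit
turned out to be elsewhere).

/*
 * Hypothetical sketch, not a fix: rather than filtering on
 * task_is_running(), skip the report entirely for per-CPU pools whose
 * CPU is not online. pool->cpu is -1 for unbound pools.
 */
static void show_cpu_pool_hog(struct worker_pool *pool)
{
	struct worker *worker;
	unsigned long irq_flags;
	int bkt;

	/* A pool bound to an offline or non-present CPU has no running
	 * worker that could be hogging it. */
	if (pool->cpu >= 0 && !cpu_online(pool->cpu))
		return;

	raw_spin_lock_irqsave(&pool->lock, irq_flags);

	hash_for_each(pool->busy_hash, bkt, worker, hentry) {
		/* Defer printing to avoid deadlocks in console drivers
		 * that queue work while holding locks also taken in
		 * their write paths. */
		printk_deferred_enter();

		pr_info("pool %d:\n", pool->id);
		sched_show_task(worker->task);

		printk_deferred_exit();
	}

	raw_spin_unlock_irqrestore(&pool->lock, irq_flags);
}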
* Re: [PATCH v2 4/5] workqueue: Show all busy workers in stall diagnostics
From: Breno Leitao @ 2026-05-07 13:11 UTC
To: Jiri Slaby
Cc: Tejun Heo, Lai Jiangshan, Andrew Morton, linux-kernel, Omar Sandoval,
    Song Liu, Danielle Costantino, kasan-dev, Petr Mladek, kernel-team

Hi Jiri,

On Thu, May 07, 2026 at 12:20:33PM +0200, Jiri Slaby wrote:
> On 05. 03. 26, 17:15, Breno Leitao wrote:
>
> BUG: workqueue lockup - pool cpus=144 node=0 flags=0x4 nice=0 stuck for
> 168224s!

That's an extremely long stall (~1.95 days).

> ...
> Showing busy workqueues and worker pools:
> workqueue rcu_gp: flags=0x108
>  pwq 578: cpus=144 node=0 flags=0x4 nice=0 active=3 refcnt=4
> in:
> https://bugzilla.suse.com/show_bug.cgi?id=1263947
>
> Can this (or another patch from the series) cause this? Should there be
> something like cpu_online() instead of task_is_running() somewhere?

This series only affects stall reporting, not detection. The changes run
after the watchdog has identified a stall, so the detection logic itself
remains unchanged.

To help diagnose this issue, could you provide some additional
information:

1) Was CPU 144 online at any point? If so, when was it taken offline?

2) Does this message appear repeatedly? If you bring CPU 144 online,
   does the issue resolve?

3) Have you run similar tests on earlier kernel versions without seeing
   this behavior, or is this a clear regression?

Thanks for the report,
--breno
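The detection/reporting split Breno describes has roughly the following
shape. This is a simplified paraphrase of the watchdog path, not
verbatim kernel code; the helper and field names below follow
kernel/workqueue.c, but the details are elided.

/*
 * Simplified sketch of the workqueue watchdog: "detection" decides
 * whether a pool has stalled, "reporting" decides what gets printed.
 * The series under discussion only changes the reporting side.
 */
static void wq_watchdog_timer_fn(struct timer_list *unused)
{
	unsigned long thresh = READ_ONCE(wq_watchdog_thresh) * HZ;
	bool lockup_detected = false;
	struct worker_pool *pool;
	int pi;

	rcu_read_lock();
	for_each_pool(pool, pi) {
		/* Detection: has the pool made progress recently? */
		if (pool->nr_workers &&
		    time_after(jiffies,
			       READ_ONCE(pool->watchdog_ts) + thresh))
			lockup_detected = true;
	}
	rcu_read_unlock();

	/* Reporting: dump busy workqueues, pools and (with this series)
	 * the backtrace of every in-flight worker. */
	if (lockup_detected)
		show_all_workqueues();
}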
* Re: [PATCH v2 4/5] workqueue: Show all busy workers in stall diagnostics
From: Jiri Slaby @ 2026-05-11 5:21 UTC
To: Breno Leitao
Cc: Tejun Heo, Lai Jiangshan, Andrew Morton, linux-kernel, Omar Sandoval,
    Song Liu, Danielle Costantino, kasan-dev, Petr Mladek, kernel-team

Hi,

we currently have several reports of this. On s390, ppc64, and x86_64.

On 07. 05. 26, 15:11, Breno Leitao wrote:
> Hi Jiri,
>
> On Thu, May 07, 2026 at 12:20:33PM +0200, Jiri Slaby wrote:
>> On 05. 03. 26, 17:15, Breno Leitao wrote:
>>
>> BUG: workqueue lockup - pool cpus=144 node=0 flags=0x4 nice=0 stuck for
>> 168224s!
>
> That's an extremely long stall (~1.95 days).
>
>> ...
>> Showing busy workqueues and worker pools:
>> workqueue rcu_gp: flags=0x108
>>  pwq 578: cpus=144 node=0 flags=0x4 nice=0 active=3 refcnt=4
>> in:
>> https://bugzilla.suse.com/show_bug.cgi?id=1263947
>>
>> Can this (or another patch from the series) cause this? Should there be
>> something like cpu_online() instead of task_is_running() somewhere?
>
> This series only affects stall reporting, not detection. The changes run
> after the watchdog has identified a stall, so the detection logic itself
> remains unchanged.
>
> To help diagnose this issue, could you provide some additional
> information:
>
> 1) Was CPU 144 online at any point? If so, when was it taken offline?

It was not; it's non-present.

> 2) Does this message appear repeatedly? If you bring CPU 144 online,
>    does the issue resolve?

Yes, it repeats. Look at this new x86_64 report's dmesg (I believe it is
related to the above report):

  BUG: workqueue lockup - pool cpus=2 node=0 flags=0x4 nice=0 stuck for 50s!

in:
https://bugzilla.suse.com/attachment.cgi?id=890229

  $ grep -c BUG sl.txt
  504
  $ grep -c pwq sl.txt
  509

It comes from:
https://bugzilla.suse.com/show_bug.cgi?id=1264554

> 3) Have you run similar tests on earlier kernel versions without seeing
>    this behavior, or is this a clear regression?

It's new in 7.0. Going back to 6.19.12 makes it disappear.

thanks,
-- 
js
suse labs
* Re: [PATCH v2 4/5] workqueue: Show all busy workers in stall diagnostics
From: Thorsten Leemhuis @ 2026-05-13 7:29 UTC
To: Jiri Slaby, Breno Leitao
Cc: Tejun Heo, Lai Jiangshan, Andrew Morton, linux-kernel, Omar Sandoval,
    Song Liu, Danielle Costantino, kasan-dev, Petr Mladek, kernel-team,
    Linux kernel regressions list

On 5/11/26 07:21, Jiri Slaby wrote:
> we currently have several reports of this. On s390, ppc64, and x86_64.

I stumbled on this by accident and this is not my area of expertise, so
the following might be bogus:

Is this maybe the same as "Observed Workqueue lockups on offline CPUs.":
https://lore.kernel.org/lkml/97a7d011-d573-4754-9e5d-68b562c64089@linux.ibm.com/

Fix is here:
https://lore.kernel.org/lkml/20260508174353.905746-1-paulmck@kernel.org/

Ciao, Thorsten
* Re: [PATCH v2 4/5] workqueue: Show all busy workers in stall diagnostics
From: Jiri Slaby @ 2026-05-13 8:03 UTC
To: Thorsten Leemhuis, Breno Leitao
Cc: Tejun Heo, Lai Jiangshan, Andrew Morton, linux-kernel, Omar Sandoval,
    Song Liu, Danielle Costantino, kasan-dev, Petr Mladek, kernel-team,
    Linux kernel regressions list, Paul E. McKenney

On 13. 05. 26, 9:29, Thorsten Leemhuis wrote:
> On 5/11/26 07:21, Jiri Slaby wrote:
>> we currently have several reports of this. On s390, ppc64, and x86_64.
>
> I stumbled on this by accident and this is not my area of expertise, so
> the following might be bogus:
>
> Is this maybe the same as "Observed Workqueue lockups on offline CPUs.":
> https://lore.kernel.org/lkml/97a7d011-d573-4754-9e5d-68b562c64089@linux.ibm.com/

Thanks, looks like pretty much it. All three reports have:

  rcu: srcu_init: Setting srcu_struct sizes to big.

> Fix is here:
> https://lore.kernel.org/lkml/20260508174353.905746-1-paulmck@kernel.org/

Building a kernel with this and serving it to the reporters to test.

-- 
js
suse labs
* Re: [PATCH v2 0/5] workqueue: Detect stalled in-flight workers
From: Markus Elfring @ 2026-05-13 8:53 UTC
To: Breno Leitao, kasan-dev, kernel-team, Andrew Morton, Lai Jiangshan,
    Tejun Heo
Cc: LKML, Danielle Costantino, Omar Sandoval, Petr Mladek, Song Liu

> There is a blind spot exists in the work queue stall detecetor …

detector?

Regards,
Markus
[parent not found: <abLsAi7_fU5FrYiF@pathway.suse.cz>]
[parent not found: <abP8wDhYWwk3ufmA@gmail.com>]
* Re: [PATCH v2 0/5] workqueue: Detect stalled in-flight workers
From: Hillf Danton @ 2026-05-13 8:57 UTC
To: Breno Leitao
Cc: Petr Mladek, Tejun Heo, linux-kernel, Omar Sandoval,
    Danielle Costantino, kasan-dev

On Fri, 13 Mar 2026 05:24:54 -0700 Breno Leitao wrote:
> On Thu, Mar 12, 2026 at 05:38:26PM +0100, Petr Mladek wrote:
> > On Thu 2026-03-05 08:15:36, Breno Leitao wrote:
> > > There is a blind spot exists in the work queue stall detecetor (aka
> > > show_cpu_pool_hog()). It only prints workers whose task_is_running() is
> > > true, so a busy worker that is sleeping (e.g. wait_event_idle())
> > > produces an empty backtrace section even though it is the cause of the
> > > stall.
> > >
> > > Additionally, when the watchdog does report stalled pools, the output
> > > doesn't show how long each in-flight work item has been running, making
> > > it harder to identify which specific worker is stuck.
> > >
> > > Example output from the sample code:
> > >
> > >   BUG: workqueue lockup - pool cpus=4 node=0 flags=0x0 nice=0 stuck for 132s!
> > >   Showing busy workqueues and worker pools:
> > >   workqueue events: flags=0x100
> > >    pwq 18: cpus=4 node=0 flags=0x0 nice=0 active=4 refcnt=5
> > >      in-flight: 178:stall_work1_fn [wq_stall]
> > >      pending: stall_work2_fn [wq_stall], free_obj_work, psi_avgs_work
> > >   ...
> > >   Showing backtraces of running workers in stalled
> > >   CPU-bound worker pools:
> > >   <nothing here>
> > >
> > > I see it happening on real machines, causing some stalls that don't
> > > have any backtrace. This is one of the code paths:
> > >
> > > 1) kfence executes toggle_allocation_gate() as a delayed workqueue
> > >    item (kfence_timer) on the system WQ.
> > >
> > > 2) toggle_allocation_gate() enables a static key, which IPIs every
> > >    CPU to patch code:
> > >      static_branch_enable(&kfence_allocation_key);
> > >
> > > 3) toggle_allocation_gate() then sleeps in TASK_IDLE waiting for a
> > >    kfence allocation to occur:
> > >      wait_event_idle(allocation_wait,
> > >                      atomic_read(&kfence_allocation_gate) > 0 || ...);
> > >
> > >    This can last indefinitely if no allocation goes through the
> > >    kfence path (or IPIing all the CPUs takes longer, which is common
> > >    on platforms that do not have NMI).
> > >
> > >    The worker remains in the pool's busy_hash
> > >    (in-flight) but is no longer task_is_running().
> > >
> > > 4) The workqueue watchdog detects the stall and calls
> > >    show_cpu_pool_hog(), which only prints backtraces for workers
> > >    that are actively running on CPU:
> > >
> > >      static void show_cpu_pool_hog(struct worker_pool *pool) {
> > >          ...
> > >          if (task_is_running(worker->task))
> > >              sched_show_task(worker->task);
> > >      }
> > >
> > > 5) Nothing is printed because the offending worker is in TASK_IDLE
> > >    state. The output shows "Showing backtraces of running workers in
> > >    stalled CPU-bound worker pools:" followed by nothing, effectively
> > >    hiding the actual culprit.
> >
> > I am trying to better understand the situation. There was a reason
> > why only the worker in the running state was shown.
> >
> > Normally, a sleeping worker should not cause a stall. The scheduler calls
> > wq_worker_sleeping() which should wake up another idle worker. There is
> > always at least one idle worker in the pool. It should start processing
> > the next pending work. Or it should fork another worker when it was
> > the last idle one.
>
> Right, but let's look at this case:
>
>   BUG: workqueue lockup - pool 55 cpu 13 curr 0 (swapper/13) stack ffff800085640000 cpus=13 node=0 flags=0x0 nice=-20 stuck for 679s!
>    work func=blk_mq_timeout_work data=0xffff0000ad7e3a05
>   Showing busy workqueues and worker pools:
>   workqueue events_unbound: flags=0x2
>    pwq 288: cpus=0-71 flags=0x4 nice=0 active=1 refcnt=2
>      in-flight: 4083734:btrfs_extent_map_shrinker_worker
>   workqueue mm_percpu_wq: flags=0x8
>    pwq 14: cpus=3 node=0 flags=0x0 nice=0 active=1 refcnt=2
>      pending: vmstat_update
>   pool 288: cpus=0-71 flags=0x4 nice=0 hung=0s workers=17 idle: 3800629 3959700 3554824 3706405 3759881 4065549 4041361 4065548 1715676 4086805 3860852 3587585 4065550 4014041 3944711 3744484
>   Showing backtraces of running workers in stalled CPU-bound worker pools:
>   # Nothing in here
>
> It seems CPU 13 is idle (curr = 0) and blk_mq_timeout_work has been
> pending for 679s?

An idle CPU failed to process pending work, so the root cause lies
outside the workqueue code, and it is difficult to see why giving Peter
more X-ray scans helps if Paul has a bone in his throat.
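A minimal reproducer in the spirit of the "wq_stall" sample named in the
quoted output might look like the sketch below. The names and module
shape here (stall_work1_fn, the wait queue, pinning to CPU 4) are
assumptions reconstructed from the sample output, not the actual test
code: a work item parked in TASK_IDLE stays in-flight in the pool's
busy_hash, yet fails the old task_is_running() filter, so the backtrace
section comes out empty.

#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/wait.h>

static DECLARE_WAIT_QUEUE_HEAD(stall_wq);
static bool stall_release;

static void stall_work1_fn(struct work_struct *work)
{
	/* Sleeps in TASK_IDLE: not task_is_running(), yet it occupies a
	 * slot in the pool's busy_hash until stall_release is set. */
	wait_event_idle(stall_wq, READ_ONCE(stall_release));
}
static DECLARE_WORK(stall_work1, stall_work1_fn);

static int __init wq_stall_init(void)
{
	/* Pin to CPU 4 to match the "pool cpus=4" sample output. */
	schedule_work_on(4, &stall_work1);
	return 0;
}

static void __exit wq_stall_exit(void)
{
	WRITE_ONCE(stall_release, true);
	wake_up(&stall_wq);
	flush_work(&stall_work1);
}

module_init(wq_stall_init);
module_exit(wq_stall_exit);
MODULE_LICENSE("GPL");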