public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* BUG: Stall on adding/removing workers into workqueue pool
@ 2024-10-29 22:03 Tim Chen
  2024-10-30 23:42 ` Tejun Heo
  0 siblings, 1 reply; 3+ messages in thread
From: Tim Chen @ 2024-10-29 22:03 UTC (permalink / raw)
  To: Tejun Heo, Lai Jiangshan
  Cc: linux-kernel, Thomas Gleixner, Doug Nelson, bp, dave.hansen, hpa,
	mingo, syzkaller-bugs, x86

Hi Tejun,

Forwarding this task hang seen by my colleague Doug Nelson. He tested
the 6.12-rc4 kernel with an OLTP workload running on a 2-socket system
with Granite Rapids CPUs that have 86 cores per socket. The traces
seem to indicate that the acquisition of wq_pool_attach_mutex stalled
in idle_cull_fn() when removing a worker from the pool. Doug hit this
problem occasionally in his tests.

Searching through the bug reports, there's a similar report by syzbot on the
6.12-rc2 kernel. Syzbot reported a similar task hang when attaching workers to
the pool: https://lore.kernel.org/all/6706c4ba.050a0220.1139e6.0008.GAE@google.com/T/
So we suspect that the problem is not GNR CPU specific.

I wonder if this is a known problem?

Thanks.

Tim


[Fri Oct 25 18:24:12 2024] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[Fri Oct 25 18:26:31 2024] INFO: task kworker/46:0H:300 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/46:0H   state:D stack:0     pid:300   tgid:300   ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Workqueue:  0x0 (events_highpri)
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_preempt_disabled+0x15/0x30
[Fri Oct 25 18:26:31 2024]  __mutex_lock.isra.14+0x431/0x690
[Fri Oct 25 18:26:31 2024]  worker_attach_to_pool+0x1f/0xd0
[Fri Oct 25 18:26:31 2024]  create_worker+0xfa/0x1f0
[Fri Oct 25 18:26:31 2024]  worker_thread+0x19c/0x240
[Fri Oct 25 18:26:31 2024]  ? __pfx_worker_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] INFO: task kworker/R-kbloc:2466 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/R-kbloc state:D stack:0     pid:2466  tgid:2466  ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_preempt_disabled+0x15/0x30
[Fri Oct 25 18:26:31 2024]  __mutex_lock.isra.14+0x431/0x690
[Fri Oct 25 18:26:31 2024]  worker_attach_to_pool+0x1f/0xd0
[Fri Oct 25 18:26:31 2024]  rescuer_thread+0x111/0x3b0
[Fri Oct 25 18:26:31 2024]  ? __pfx_rescuer_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ? __pfx_rescuer_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] INFO: task kworker/46:1H:3592 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/46:1H   state:D stack:0     pid:3592  tgid:3592  ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Workqueue: kblockd blk_mq_timeout_work
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_timeout+0x283/0x2c0
[Fri Oct 25 18:26:31 2024]  ? sched_balance_rq+0xe5/0xd90
[Fri Oct 25 18:26:31 2024]  ? __prepare_to_swait+0x52/0x80
[Fri Oct 25 18:26:31 2024]  wait_for_completion_state+0x173/0x1d0
[Fri Oct 25 18:26:31 2024]  __wait_rcu_gp+0x121/0x150
[Fri Oct 25 18:26:31 2024]  synchronize_rcu_normal.part.63+0x3a/0x60
[Fri Oct 25 18:26:31 2024]  ? __pfx_call_rcu_hurry+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ? __pfx_wakeme_after_rcu+0x10/0x10
[Fri Oct 25 18:26:31 2024]  synchronize_rcu_normal+0x9a/0xb0
[Fri Oct 25 18:26:31 2024]  blk_mq_timeout_work+0x142/0x1a0
[Fri Oct 25 18:26:31 2024]  process_scheduled_works+0xa3/0x3e0
[Fri Oct 25 18:26:31 2024]  worker_thread+0x117/0x240
[Fri Oct 25 18:26:31 2024]  ? __pfx_worker_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] INFO: task kworker/u1377:1:38000 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/u1377:1 state:D stack:0     pid:38000 tgid:38000 ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Workqueue: events_unbound idle_cull_fn
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_preempt_disabled+0x15/0x30
[Fri Oct 25 18:26:31 2024]  __mutex_lock.isra.14+0x431/0x690
[Fri Oct 25 18:26:31 2024]  idle_cull_fn+0x3b/0xe0
[Fri Oct 25 18:26:31 2024]  process_scheduled_works+0xa3/0x3e0
[Fri Oct 25 18:26:31 2024]  worker_thread+0x117/0x240
[Fri Oct 25 18:26:31 2024]  ? __pfx_worker_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] INFO: task kworker/u1377:3:46111 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/u1377:3 state:D stack:0     pid:46111 tgid:46111 ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Workqueue: events_unbound idle_cull_fn
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_preempt_disabled+0x15/0x30
[Fri Oct 25 18:26:31 2024]  __mutex_lock.isra.14+0x431/0x690
[Fri Oct 25 18:26:31 2024]  ? ttwu_do_activate+0x6a/0x210
[Fri Oct 25 18:26:31 2024]  idle_cull_fn+0x3b/0xe0
[Fri Oct 25 18:26:31 2024]  process_scheduled_works+0xa3/0x3e0
[Fri Oct 25 18:26:31 2024]  worker_thread+0x117/0x240
[Fri Oct 25 18:26:31 2024]  ? __pfx_worker_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] INFO: task kworker/u1377:5:46411 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/u1377:5 state:D stack:0     pid:46411 tgid:46411 ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Workqueue: events_unbound idle_cull_fn
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_preempt_disabled+0x15/0x30
[Fri Oct 25 18:26:31 2024]  __mutex_lock.isra.14+0x431/0x690
[Fri Oct 25 18:26:31 2024]  ? try_to_wake_up+0x22e/0x690
[Fri Oct 25 18:26:31 2024]  idle_cull_fn+0x3b/0xe0
[Fri Oct 25 18:26:31 2024]  process_scheduled_works+0xa3/0x3e0
[Fri Oct 25 18:26:31 2024]  worker_thread+0x117/0x240
[Fri Oct 25 18:26:31 2024]  ? __pfx_worker_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] INFO: task kworker/u1379:6:53043 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/u1379:6 state:D stack:0     pid:53043 tgid:53043 ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Workqueue: events_unbound idle_cull_fn
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_preempt_disabled+0x15/0x30
[Fri Oct 25 18:26:31 2024]  __mutex_lock.isra.14+0x431/0x690
[Fri Oct 25 18:26:31 2024]  idle_cull_fn+0x3b/0xe0
[Fri Oct 25 18:26:31 2024]  process_scheduled_works+0xa3/0x3e0
[Fri Oct 25 18:26:31 2024]  worker_thread+0x117/0x240
[Fri Oct 25 18:26:31 2024]  ? __pfx_worker_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] INFO: task kworker/u1377:7:53460 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/u1377:7 state:D stack:0     pid:53460 tgid:53460 ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Workqueue: events_unbound idle_cull_fn
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_preempt_disabled+0x15/0x30
[Fri Oct 25 18:26:31 2024]  __mutex_lock.isra.14+0x431/0x690
[Fri Oct 25 18:26:31 2024]  idle_cull_fn+0x3b/0xe0
[Fri Oct 25 18:26:31 2024]  process_scheduled_works+0xa3/0x3e0
[Fri Oct 25 18:26:31 2024]  worker_thread+0x117/0x240
[Fri Oct 25 18:26:31 2024]  ? __pfx_worker_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] INFO: task kworker/u1377:6:67016 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/u1377:6 state:D stack:0     pid:67016 tgid:67016 ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Workqueue:  0x0 (events_unbound)
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_preempt_disabled+0x15/0x30
[Fri Oct 25 18:26:31 2024]  __mutex_lock.isra.14+0x431/0x690
[Fri Oct 25 18:26:31 2024]  worker_attach_to_pool+0x1f/0xd0
[Fri Oct 25 18:26:31 2024]  create_worker+0xfa/0x1f0
[Fri Oct 25 18:26:31 2024]  worker_thread+0x19c/0x240
[Fri Oct 25 18:26:31 2024]  ? __pfx_worker_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] INFO: task kworker/u1379:4:67681 blocked for more than 122 seconds.
[Fri Oct 25 18:26:31 2024]       Tainted: G S         OE      6.12.0-rc4-dis_fgs #1
[Fri Oct 25 18:26:31 2024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Fri Oct 25 18:26:31 2024] task:kworker/u1379:4 state:D stack:0     pid:67681 tgid:67681 ppid:2      flags:0x00004000
[Fri Oct 25 18:26:31 2024] Workqueue: events_unbound idle_cull_fn
[Fri Oct 25 18:26:31 2024] Call Trace:
[Fri Oct 25 18:26:31 2024]  <TASK>
[Fri Oct 25 18:26:31 2024]  __schedule+0x347/0xd70
[Fri Oct 25 18:26:31 2024]  schedule+0x36/0xc0
[Fri Oct 25 18:26:31 2024]  schedule_preempt_disabled+0x15/0x30
[Fri Oct 25 18:26:31 2024]  __mutex_lock.isra.14+0x431/0x690
[Fri Oct 25 18:26:31 2024]  ? try_to_wake_up+0x22e/0x690
[Fri Oct 25 18:26:31 2024]  idle_cull_fn+0x3b/0xe0
[Fri Oct 25 18:26:31 2024]  process_scheduled_works+0xa3/0x3e0
[Fri Oct 25 18:26:31 2024]  worker_thread+0x117/0x240
[Fri Oct 25 18:26:31 2024]  ? __pfx_worker_thread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  kthread+0xcf/0x100
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork+0x31/0x40
[Fri Oct 25 18:26:31 2024]  ? __pfx_kthread+0x10/0x10
[Fri Oct 25 18:26:31 2024]  ret_from_fork_asm+0x1a/0x30
[Fri Oct 25 18:26:31 2024]  </TASK>
[Fri Oct 25 18:26:31 2024] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
[oracle@bhs-1 GNR startup_scripts]$ uname -a
Linux bhs-1 6.12.0-rc4-dis_fgs #1 SMP PREEMPT_DYNAMIC Fri Oct 25 07:26:06 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: BUG: Stall on adding/removing workers into workqueue pool
  2024-10-29 22:03 BUG: Stall on adding/removing workers into workqueue pool Tim Chen
@ 2024-10-30 23:42 ` Tejun Heo
  2024-11-01 17:37   ` Tim Chen
  0 siblings, 1 reply; 3+ messages in thread
From: Tejun Heo @ 2024-10-30 23:42 UTC (permalink / raw)
  To: Tim Chen
  Cc: Lai Jiangshan, linux-kernel, Thomas Gleixner, Doug Nelson, bp,
	dave.hansen, hpa, mingo, syzkaller-bugs, x86

Hello, Tim.

On Tue, Oct 29, 2024 at 03:03:33PM -0700, Tim Chen wrote:
> Hi Tejun,
> 
> Forwarding this task hang seen by my colleague Doug Nelson. He tested
> the 6.12-rc4 kernel with an OLTP workload running on a 2-socket system
> with Granite Rapids CPUs that have 86 cores per socket. The traces
> seem to indicate that the acquisition of wq_pool_attach_mutex stalled
> in idle_cull_fn() when removing a worker from the pool. Doug hit this
> problem occasionally in his tests.
> 
> Searching through the bug reports, there's a similar report by syzbot on the
> 6.12-rc2 kernel. Syzbot reported a similar task hang when attaching workers to
> the pool: https://lore.kernel.org/all/6706c4ba.050a0220.1139e6.0008.GAE@google.com/T/
> So we suspect that the problem is not GNR CPU specific.
> 
> I wonder if this is a known problem?

First time I'm seeing it. The trace doesn't show who's holding the mutex.
There doesn't seem to be any place where that mutex should leak, at least at
a glance, so hopefully it shouldn't be too difficult to find who's holding it.
Can you trigger sysrq-d and sysrq-t and post the output?
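[For reference, sysrq-d (show all held locks; needs CONFIG_LOCKDEP) and sysrq-t (dump all task states) can be driven from /proc/sysrq-trigger. The helper below only prints the commands rather than executing them; run the printed commands as root on the affected machine, with sysrq enabled via the kernel.sysrq sysctl.]

```shell
# Print (not run) the sysrq commands that dump held locks (d) and tasks (t).
sysrq_cmds() {
    for key in d t; do
        printf 'echo %s > /proc/sysrq-trigger\n' "$key"
    done
}
sysrq_cmds
```

After triggering both keys, `dmesg > sysrq-dump.txt` captures the lock and task dumps for the reply.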

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: BUG: Stall on adding/removing workers into workqueue pool
  2024-10-30 23:42 ` Tejun Heo
@ 2024-11-01 17:37   ` Tim Chen
  0 siblings, 0 replies; 3+ messages in thread
From: Tim Chen @ 2024-11-01 17:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lai Jiangshan, linux-kernel, Thomas Gleixner, Doug Nelson, bp,
	dave.hansen, hpa, mingo, syzkaller-bugs, x86

On Wed, 2024-10-30 at 13:42 -1000, Tejun Heo wrote:
> Hello, Tim.
> 
> On Tue, Oct 29, 2024 at 03:03:33PM -0700, Tim Chen wrote:
> > Hi Tejun,
> > 
> > Forwarding this task hang seen by my colleague Doug Nelson. He tested
> > the 6.12-rc4 kernel with an OLTP workload running on a 2-socket system
> > with Granite Rapids CPUs that have 86 cores per socket. The traces
> > seem to indicate that the acquisition of wq_pool_attach_mutex stalled
> > in idle_cull_fn() when removing a worker from the pool. Doug hit this
> > problem occasionally in his tests.
> > 
> > Searching through the bug reports, there's a similar report by syzbot on the
> > 6.12-rc2 kernel. Syzbot reported a similar task hang when attaching workers to
> > the pool: https://lore.kernel.org/all/6706c4ba.050a0220.1139e6.0008.GAE@google.com/T/
> > So we suspect that the problem is not GNR CPU specific.
> > 
> > I wonder if this is a known problem?
> 
> First time I'm seeing it. The trace doesn't show who's holding the mutex.
> There doesn't seem to be any place where that mutex should leak, at least at
> a glance, so hopefully it shouldn't be too difficult to find who's holding it.
> Can you trigger sysrq-d and sysrq-t and post the output?
> 

I'll ask Doug to see if he can get that info when he sees the hang again. But
syzbot does have that info in its log of the bug (https://syzkaller.appspot.com/bug?extid=8b08b50984ccfdd38ce2).
Clipping out the relevant info below.

Tim

Log from Syzbot:
Showing all locks held in the system:
---showing only wq_pool_attach_mutex---
1 lock held by kworker/0:0/8:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-mm_pe/13:
1 lock held by khungtaskd/30:
 #0: ffffffff8e937e20 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
 #0: ffffffff8e937e20 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
 #0: ffffffff8e937e20 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x55/0x2a0 kernel/locking/lockdep.c:6720
4 locks held by kworker/u8:7/1307:
 #0: ffff88801baed948 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3204 [inline]
 #0: ffff88801baed948 ((wq_completion)netns){+.+.}-{0:0}, at: process_scheduled_works+0x93b/0x1850 kernel/workqueue.c:3310
 #1: ffffc9000466fd00 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3205 [inline]
 #1: ffffc9000466fd00 (net_cleanup_work){+.+.}-{0:0}, at: process_scheduled_works+0x976/0x1850 kernel/workqueue.c:3310
 #2: ffffffff8fcc6350 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:580
 #3: ffff88807e155428 (&wg->device_update_lock){+.+.}-{3:3}, at: wg_destruct+0x110/0x2e0 drivers/net/wireguard/device.c:249
1 lock held by kworker/R-dm_bu/2373:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
2 locks held by getty/4989:
 #0: ffff88814bfbe0a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
 #1: ffffc900031332f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6a6/0x1e00 drivers/tty/n_tty.c:2211
4 locks held by kworker/0:7/5348:
3 locks held by kworker/u8:10/6933:
1 lock held by kworker/R-wg-cr/9157:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/9159:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/9428:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/9429:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/9471:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/9472:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/9912:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/9913:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/9914:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/10385:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/10386:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/10387:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/10446:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/10447:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/10578:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/10580:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/10581:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/10673:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/10675:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
1 lock held by kworker/R-wg-cr/11051:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/11054:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_detach_from_pool kernel/workqueue.c:2727 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xaf5/0x10a0 kernel/workqueue.c:3526
1 lock held by kworker/R-wg-cr/11055:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: worker_attach_to_pool+0x31/0x390 kernel/workqueue.c:2669
7 locks held by syz-executor/11585:
 #0: ffff888032260420 (sb_writers#8){.+.+}-{0:0}, at: file_start_write include/linux/fs.h:2931 [inline]
 #0: ffff888032260420 (sb_writers#8){.+.+}-{0:0}, at: vfs_write+0x224/0xc90 fs/read_write.c:679
 #1: ffff888065546888 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x1ea/0x500 fs/kernfs/file.c:325
 #2: ffff888144f145a8 (kn->active#49){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x20e/0x500 fs/kernfs/file.c:326
 #3: ffffffff8f570e28 (nsim_bus_dev_list_lock){+.+.}-{3:3}, at: del_device_store+0xfc/0x480 drivers/net/netdevsim/bus.c:216
 #4: ffff8880609060e8 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:1014 [inline]
 #4: ffff8880609060e8 (&dev->mutex){....}-{3:3}, at: __device_driver_lock drivers/base/dd.c:1095 [inline]
 #4: ffff8880609060e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0xce/0x7c0 drivers/base/dd.c:1293
 #5: ffff88802fbdb250 (&devlink->lock_key#28){+.+.}-{3:3}, at: nsim_drv_remove+0x50/0x160 drivers/net/netdevsim/dev.c:1675
 #6: ffffffff8e93d3b8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: exp_funnel_lock kernel/rcu/tree_exp.h:329 [inline]
 #6: ffffffff8e93d3b8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x451/0x830 kernel/rcu/tree_exp.h:976
1 lock held by kworker/R-bond0/11593:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-bond0/11595:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-bond0/11600:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-bond0/11605:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-bond0/11608:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11616:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11617:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11619:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11620:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11621:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11622:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11623:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11624:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11625:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11626:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11627:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11628:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11629:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11630:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11631:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-bond0/11706:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-bond0/11707:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-bond0/11708:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-bond0/11711:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11724:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11725:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11726:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11727:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11728:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11729:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11730:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11731:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11732:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11735:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11736:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by kworker/R-wg-cr/11738:
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: set_pf_worker kernel/workqueue.c:3316 [inline]
 #0: ffffffff8e7e23e8 (wq_pool_attach_mutex){+.+.}-{3:3}, at: rescuer_thread+0xd0/0x10a0 kernel/workqueue.c:3443
1 lock held by syz-executor/11786:
 #0: ffffffff8fcc6350 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x328/0x570 net/core/net_namespace.c:490
1 lock held by syz-executor/11798:
 #0: ffffffff8fcc6350 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x328/0x570 net/core/net_namespace.c:490
1 lock held by syz-executor/11800:
 #0: ffffffff8fcc6350 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x328/0x570 net/core/net_namespace.c:490
1 lock held by syz-executor/11803:
 #0: ffffffff8fcc6350 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x328/0x570 net/core/net_namespace.c:490
1 lock held by syz-executor/11804:
 #0: ffffffff8fcc6350 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x328/0x570 net/core/net_namespace.c:490
2 locks held by dhcpcd/11850:
 #0: ffff888069c52258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1611 [inline]
 #0: ffff888069c52258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: packet_do_bind+0x32/0xcb0 net/packet/af_packet.c:3266
 #1: ffffffff8e93d3b8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: exp_funnel_lock kernel/rcu/tree_exp.h:329 [inline]
 #1: ffffffff8e93d3b8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x451/0x830 kernel/rcu/tree_exp.h:976
1 lock held by dhcpcd/11851:
 #0: ffff88807ed0a258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1611 [inline]
 #0: ffff88807ed0a258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: packet_do_bind+0x32/0xcb0 net/packet/af_packet.c:3266
1 lock held by dhcpcd/11852:
 #0: ffff88802925e258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1611 [inline]
 #0: ffff88802925e258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: packet_do_bind+0x32/0xcb0 net/packet/af_packet.c:3266
1 lock held by dhcpcd/11853:
 #0: ffff88802925c258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1611 [inline]
 #0: ffff88802925c258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: packet_do_bind+0x32/0xcb0 net/packet/af_packet.c:3266
1 lock held by dhcpcd/11854:
 #0: ffff8880455b6258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1611 [inline]
 #0: ffff8880455b6258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: packet_do_bind+0x32/0xcb0 net/packet/af_packet.c:3266
1 lock held by dhcpcd/11855:
 #0: ffff8880286b2258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1611 [inline]
 #0: ffff8880286b2258 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: packet_do_bind+0x32/0xcb0 net/packet/af_packet.c:3266


Thread overview: 3+ messages
-- links below jump to the message on this page --
2024-10-29 22:03 BUG: Stall on adding/removing wokers into workqueue pool Tim Chen
2024-10-30 23:42 ` Tejun Heo
2024-11-01 17:37   ` Tim Chen
