linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* USB: workqueues stuck in 'D' state?
@ 2024-12-06 23:07 Linus Torvalds
  2024-12-07  2:02 ` Alan Stern
  2024-12-08 15:51 ` Greg Kroah-Hartman
  0 siblings, 2 replies; 6+ messages in thread
From: Linus Torvalds @ 2024-12-06 23:07 UTC (permalink / raw)
  To: Mathias Nyman, Greg Kroah-Hartman; +Cc: USB subsystem

So I'm not sure if this is new or not, but I *think* I would have
noticed it earlier.

On my Ampere Altra (128-core arm64 system), I started seeing 'top'
claiming a load average of roughly 2.3 even when idle, and it seems to
be all due to this:

  $ ps ax | grep ' [DR] '
    869 ?        D      0:00 [kworker/24:1+usb_hub_wq]
   1900 ?        D      0:00 [kworker/24:7+pm]

where sometimes there are multiple of those 'pm' workers.

Doing a sysrq-w, I get this:

  task:kworker/24:3    state:D stack:0     pid:1308  tgid:1308  ppid:2
     flags:0x00000008
  Workqueue: pm pm_runtime_work
  Call trace:
   __switch_to+0xf4/0x168 (T)
   __schedule+0x248/0x648
   schedule+0x3c/0xe0
   usleep_range_state+0x118/0x150
   xhci_hub_control+0xe80/0x1090
   rh_call_control+0x274/0x7a0
   usb_hcd_submit_urb+0x13c/0x3a0
   usb_submit_urb+0x1c8/0x600
   usb_start_wait_urb+0x7c/0x180
   usb_control_msg+0xcc/0x150
   usb_port_suspend+0x414/0x510
   usb_generic_driver_suspend+0x68/0x90
   usb_suspend_both+0x1c8/0x290
   usb_runtime_suspend+0x3c/0xb0
   __rpm_callback+0x50/0x1f0
   rpm_callback+0x70/0x88
   rpm_suspend+0xe8/0x5a8
   __pm_runtime_suspend+0x4c/0x130
   usb_runtime_idle+0x48/0x68
   rpm_idle+0xa4/0x358
   pm_runtime_work+0xb0/0xe0

  task:kworker/24:7    state:D stack:0     pid:1900  tgid:1900  ppid:2
     flags:0x00000208
  Workqueue: pm pm_runtime_work
  Call trace:
   __switch_to+0xf4/0x168 (T)
   __schedule+0x248/0x648
   schedule+0x3c/0xe0
   usleep_range_state+0x118/0x150
   xhci_hub_control+0xe80/0x1090
   rh_call_control+0x274/0x7a0
   usb_hcd_submit_urb+0x13c/0x3a0
   usb_submit_urb+0x1c8/0x600
   usb_start_wait_urb+0x7c/0x180
   usb_control_msg+0xcc/0x150
   usb_port_suspend+0x414/0x510
   usb_generic_driver_suspend+0x68/0x90
   usb_suspend_both+0x1c8/0x290
   usb_runtime_suspend+0x3c/0xb0
   __rpm_callback+0x50/0x1f0
   rpm_callback+0x70/0x88
   rpm_suspend+0xe8/0x5a8
   __pm_runtime_suspend+0x4c/0x130

so it seems to be all in that xhci_hub_control() path. I'm not seeing
anything that has changed in the xhci driver in this merge window, so
maybe this goes back further, and I just haven't noticed this odd load
average issue before.

The call trace for the usb_hub_wq seems a lot less stable, but I've
seen backtraces like

  task:kworker/24:1    state:D stack:0     pid:869   tgid:869   ppid:2
     flags:0x00000008
  Workqueue: usb_hub_wq hub_event
  Call trace:
   __switch_to+0xf4/0x168 (T)
   __schedule+0x248/0x648
   schedule+0x3c/0xe0
   schedule_preempt_disabled+0x2c/0x50
   __mutex_lock.constprop.0+0x478/0x968
   __mutex_lock_slowpath+0x1c/0x38
   mutex_lock+0x6c/0x88
   hub_event+0x144/0x4a0
   process_one_work+0x170/0x408
   worker_thread+0x2cc/0x400
   kthread+0xf4/0x108
   ret_from_fork+0x10/0x20

But also just

  Workqueue: usb_hub_wq hub_event
  Call trace:
   __switch_to+0xf4/0x168 (T)
   usb_control_msg+0xcc/0x150

or

  Workqueue: usb_hub_wq hub_event
  Call trace:
   __switch_to+0xf4/0x168 (T)
   __schedule+0x248/0x648
   schedule+0x3c/0xe0
   schedule_timeout+0x94/0x120
   msleep+0x30/0x50

so at a guess it's just some interaction with that 'pm' workqueue.

I did a reboot just to verify that yes, it happened again after a
fresh boot. So it is at least *somewhat* consistently repeatable,
although I wouldn't be surprised if it's some kind of timing-dependent
race condition that just happens to trigger on this machine.

I could try to see if it's so consistent that I could bisect it, but
before I start doing that, maybe just the backtraces makes somebody go
"Ahh, that smells like XYZ"..

Hmm?

             Linus

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-12-09 23:27 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-06 23:07 USB: workqueues stuck in 'D' state? Linus Torvalds
2024-12-07  2:02 ` Alan Stern
2024-12-08 15:51 ` Greg Kroah-Hartman
2024-12-08 19:14   ` Linus Torvalds
2024-12-08 20:17     ` Michał Pecio
2024-12-09 23:27       ` Michał Pecio

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).