* Re: [4.3] kworker busy in pm_runtime_work [not found] ` <Pine.LNX.4.44L0.1511161021280.1938-100000@iolanthe.rowland.org> @ 2015-11-22 11:24 ` Daniel J Blueman 2015-11-23 15:52 ` Alan Stern 0 siblings, 1 reply; 5+ messages in thread From: Daniel J Blueman @ 2015-11-22 11:24 UTC (permalink / raw) To: linux-usb, Linux Kernel; +Cc: Alan Stern On 16 November 2015 at 23:22, Alan Stern <stern@rowland.harvard.edu> wrote: > On Mon, 16 Nov 2015, Daniel J Blueman wrote: > >> Tuning USB suspend [1] in 4.3 on a Dell XPS 15 9553 (Skylake), I see a >> kworker thread spinning in rpm_suspend [2]. >> >> What is the most useful debug to get here beyond the immediate [3]? > > You can try doing: > > echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control kworker and ksoftirqd spinning occurs when I echo 'auto' to all the USB control entries. Using Alan's excellent tip, we see this being logged repeatedly at a high rate: [ 353.245180] usb usb1-port4: status 0107 change 0000 [ 353.245194] usb usb1-port12: status 0507 change 0000 [ 353.245202] hub 1-0:1.0: state 7 ports 16 chg 0000 evt 0000 [ 353.245203] hub 1-0:1.0: hub_suspend [ 353.245205] usb usb1: bus auto-suspend, wakeup 1 [ 353.245206] usb usb1: bus suspend fail, err -16 [ 353.245207] hub 1-0:1.0: hub_resume ... So, EBUSY. Both the webcam is not open, and the bluetooth interface [1] is rfkill'd; the situation occurs even if I unload all related modules. What further debug would be useful? Thanks! Daniel -- [1] Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 001 Device 002: ID 0a5c:6410 Broadcom Corp. Bus 001 Device 003: ID 1bcf:2b95 Sunplus Innovation Technology Inc. Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub -- Daniel J Blueman ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [4.3] kworker busy in pm_runtime_work 2015-11-22 11:24 ` [4.3] kworker busy in pm_runtime_work Daniel J Blueman @ 2015-11-23 15:52 ` Alan Stern 2015-11-24 22:23 ` Daniel J Blueman 0 siblings, 1 reply; 5+ messages in thread From: Alan Stern @ 2015-11-23 15:52 UTC (permalink / raw) To: Daniel J Blueman; +Cc: USB list, Linux Kernel On Sun, 22 Nov 2015, Daniel J Blueman wrote: > On 16 November 2015 at 23:22, Alan Stern <stern@rowland.harvard.edu> wrote: > > On Mon, 16 Nov 2015, Daniel J Blueman wrote: > > > >> Tuning USB suspend [1] in 4.3 on a Dell XPS 15 9553 (Skylake), I see a > >> kworker thread spinning in rpm_suspend [2]. > >> > >> What is the most useful debug to get here beyond the immediate [3]? > > > > You can try doing: > > > > echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control > > kworker and ksoftirqd spinning occurs when I echo 'auto' to all the > USB control entries. Using Alan's excellent tip, we see this being > logged repeatedly at a high rate: > [ 353.245180] usb usb1-port4: status 0107 change 0000 > [ 353.245194] usb usb1-port12: status 0507 change 0000 > [ 353.245202] hub 1-0:1.0: state 7 ports 16 chg 0000 evt 0000 > [ 353.245203] hub 1-0:1.0: hub_suspend > [ 353.245205] usb usb1: bus auto-suspend, wakeup 1 > [ 353.245206] usb usb1: bus suspend fail, err -16 > [ 353.245207] hub 1-0:1.0: hub_resume > ... > > So, EBUSY. Both the webcam is not open, and the bluetooth interface > [1] is rfkill'd; the situation occurs even if I unload all related > modules. > > What further debug would be useful? > > Thanks! > Daniel > > -- [1] > > Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub > Bus 001 Device 002: ID 0a5c:6410 Broadcom Corp. > Bus 001 Device 003: ID 1bcf:2b95 Sunplus Innovation Technology Inc. > Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub Since bus 1 uses an xHCI controller, you should do: echo 'module xhci-hcd =p' >/sys/kernel/debug/dynamic_debug/control I'm reasonably sure this will end up printing "suspend failed because a port is resuming", since that's the only place where xhci_bus_suspend() fails with -EBUSY, but you should try it to confirm this. Alan Stern ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [4.3] kworker busy in pm_runtime_work 2015-11-23 15:52 ` Alan Stern @ 2015-11-24 22:23 ` Daniel J Blueman 2015-11-25 15:42 ` Alan Stern 0 siblings, 1 reply; 5+ messages in thread From: Daniel J Blueman @ 2015-11-24 22:23 UTC (permalink / raw) To: Alan Stern; +Cc: USB list, Linux Kernel On 23 November 2015 at 23:52, Alan Stern <stern@rowland.harvard.edu> wrote: > On Sun, 22 Nov 2015, Daniel J Blueman wrote: > >> On 16 November 2015 at 23:22, Alan Stern <stern@rowland.harvard.edu> wrote: >> > On Mon, 16 Nov 2015, Daniel J Blueman wrote: >> > >> >> Tuning USB suspend [1] in 4.3 on a Dell XPS 15 9553 (Skylake), I see a >> >> kworker thread spinning in rpm_suspend [2]. >> >> >> >> What is the most useful debug to get here beyond the immediate [3]? >> > >> > You can try doing: >> > >> > echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control >> >> kworker and ksoftirqd spinning occurs when I echo 'auto' to all the >> USB control entries. Using Alan's excellent tip, we see this being >> logged repeatedly at a high rate: >> [ 353.245180] usb usb1-port4: status 0107 change 0000 >> [ 353.245194] usb usb1-port12: status 0507 change 0000 >> [ 353.245202] hub 1-0:1.0: state 7 ports 16 chg 0000 evt 0000 >> [ 353.245203] hub 1-0:1.0: hub_suspend >> [ 353.245205] usb usb1: bus auto-suspend, wakeup 1 >> [ 353.245206] usb usb1: bus suspend fail, err -16 >> [ 353.245207] hub 1-0:1.0: hub_resume >> ... >> >> So, EBUSY. Both the webcam is not open, and the bluetooth interface >> [1] is rfkill'd; the situation occurs even if I unload all related >> modules. >> >> What further debug would be useful? >> >> Thanks! >> Daniel >> >> -- [1] >> >> Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub >> Bus 001 Device 002: ID 0a5c:6410 Broadcom Corp. >> Bus 001 Device 003: ID 1bcf:2b95 Sunplus Innovation Technology Inc. >> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub > > Since bus 1 uses an xHCI controller, you should do: > > echo 'module xhci-hcd =p' >/sys/kernel/debug/dynamic_debug/control > > I'm reasonably sure this will end up printing "suspend failed > because a port is resuming", since that's the only place where > xhci_bus_suspend() fails with -EBUSY, but you should try it to confirm > this. I had to use: echo 'module xhci_hcd =p' >/sys/kernel/debug/dynamic_debug/control and indeed we see: [29172.246221] xhci_hcd 0000:00:14.0: get port status, actual port 11 status = 0xe63 [29172.246222] xhci_hcd 0000:00:14.0: Get port status returned 0x507 [29172.246224] xhci_hcd 0000:00:14.0: get port status, actual port 12 status = 0x2a0 [29172.246228] xhci_hcd 0000:00:14.0: get port status, actual port 13 status = 0x2a0 [29172.246228] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246231] xhci_hcd 0000:00:14.0: get port status, actual port 14 status = 0x2a0 [29172.246232] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246235] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246248] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246254] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246264] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246275] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246285] xhci_hcd 0000:00:14.0: Get port status returned 0x507 [29172.246294] xhci_hcd 0000:00:14.0: get port status, actual port 14 status = 0x2a0 [29172.246302] xhci_hcd 0000:00:14.0: suspend failed because a port is resuming [29172.246321] xhci_hcd 0000:00:14.0: Get port status returned 0x107 [29172.246332] xhci_hcd 0000:00:14.0: get port status, actual port 6 status = 0x2a0 [29172.246346] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246359] xhci_hcd 0000:00:14.0: get port status, actual port 13 status = 0x2a0 [29172.246364] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246366] xhci_hcd 0000:00:14.0: get port status, actual port 15 status = 0x2a0 [29172.246371] xhci_hcd 0000:00:14.0: suspend failed because a port is resuming [29172.246380] xhci_hcd 0000:00:14.0: Get port status returned 0x100 [29172.246382] xhci_hcd 0000:00:14.0: get port status, actual port 1 status = 0x2a0 [29172.246383] xhci_hcd 0000:00:14.0: Get port status returned 0x100 ... -- Daniel J Blueman ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [4.3] kworker busy in pm_runtime_work 2015-11-24 22:23 ` Daniel J Blueman @ 2015-11-25 15:42 ` Alan Stern 2015-11-26 13:28 ` Mathias Nyman 0 siblings, 1 reply; 5+ messages in thread From: Alan Stern @ 2015-11-25 15:42 UTC (permalink / raw) To: Daniel J Blueman, Mathias Nyman; +Cc: USB list, Linux Kernel On Wed, 25 Nov 2015, Daniel J Blueman wrote: > On 23 November 2015 at 23:52, Alan Stern <stern@rowland.harvard.edu> wrote: > > On Sun, 22 Nov 2015, Daniel J Blueman wrote: > > > >> On 16 November 2015 at 23:22, Alan Stern <stern@rowland.harvard.edu> wrote: > >> > On Mon, 16 Nov 2015, Daniel J Blueman wrote: > >> > > >> >> Tuning USB suspend [1] in 4.3 on a Dell XPS 15 9553 (Skylake), I see a > >> >> kworker thread spinning in rpm_suspend [2]. > >> >> > >> >> What is the most useful debug to get here beyond the immediate [3]? > >> > > >> > You can try doing: > >> > > >> > echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control > >> > >> kworker and ksoftirqd spinning occurs when I echo 'auto' to all the > >> USB control entries. Using Alan's excellent tip, we see this being > >> logged repeatedly at a high rate: > >> [ 353.245180] usb usb1-port4: status 0107 change 0000 > >> [ 353.245194] usb usb1-port12: status 0507 change 0000 > >> [ 353.245202] hub 1-0:1.0: state 7 ports 16 chg 0000 evt 0000 > >> [ 353.245203] hub 1-0:1.0: hub_suspend > >> [ 353.245205] usb usb1: bus auto-suspend, wakeup 1 > >> [ 353.245206] usb usb1: bus suspend fail, err -16 > >> [ 353.245207] hub 1-0:1.0: hub_resume > >> ... > >> > >> So, EBUSY. Both the webcam is not open, and the bluetooth interface > >> [1] is rfkill'd; the situation occurs even if I unload all related > >> modules. > >> > >> What further debug would be useful? > >> > >> Thanks! > >> Daniel > >> > >> -- [1] > >> > >> Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub > >> Bus 001 Device 002: ID 0a5c:6410 Broadcom Corp. > >> Bus 001 Device 003: ID 1bcf:2b95 Sunplus Innovation Technology Inc. > >> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub > > > > Since bus 1 uses an xHCI controller, you should do: > > > > echo 'module xhci-hcd =p' >/sys/kernel/debug/dynamic_debug/control > > > > I'm reasonably sure this will end up printing "suspend failed > > because a port is resuming", since that's the only place where > > xhci_bus_suspend() fails with -EBUSY, but you should try it to confirm > > this. > > I had to use: > echo 'module xhci_hcd =p' >/sys/kernel/debug/dynamic_debug/control > > and indeed we see: > [29172.246221] xhci_hcd 0000:00:14.0: get port status, actual port 11 > status = 0xe63 > [29172.246222] xhci_hcd 0000:00:14.0: Get port status returned 0x507 > [29172.246224] xhci_hcd 0000:00:14.0: get port status, actual port 12 > status = 0x2a0 > [29172.246228] xhci_hcd 0000:00:14.0: get port status, actual port 13 > status = 0x2a0 > [29172.246228] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246231] xhci_hcd 0000:00:14.0: get port status, actual port 14 > status = 0x2a0 > [29172.246232] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246235] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246248] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246254] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246264] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246275] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246285] xhci_hcd 0000:00:14.0: Get port status returned 0x507 > [29172.246294] xhci_hcd 0000:00:14.0: get port status, actual port 14 > status = 0x2a0 > [29172.246302] xhci_hcd 0000:00:14.0: suspend failed because a port is resuming > [29172.246321] xhci_hcd 0000:00:14.0: Get port status returned 0x107 > [29172.246332] xhci_hcd 0000:00:14.0: get port status, actual port 6 > status = 0x2a0 > [29172.246346] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246359] xhci_hcd 0000:00:14.0: get port status, actual port 13 > status = 0x2a0 > [29172.246364] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246366] xhci_hcd 0000:00:14.0: get port status, actual port 15 > status = 0x2a0 > [29172.246371] xhci_hcd 0000:00:14.0: suspend failed because a port is resuming > [29172.246380] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > [29172.246382] xhci_hcd 0000:00:14.0: get port status, actual port 1 > status = 0x2a0 > [29172.246383] xhci_hcd 0000:00:14.0: Get port status returned 0x100 It's time to get some assistance from the xhci-hcd maintainer. Mathias, Daniel's machine is spinning, trying to suspend the root hub on the xHCI HS/FS/LS bus and failing, over and over again. It looks like either bus_state->resuming_ports or bus_state->port_remote_wakeup is nonzero when it shouldn't be. The information we have is all above. Can you help Daniel debug this? Alan Stern ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [4.3] kworker busy in pm_runtime_work 2015-11-25 15:42 ` Alan Stern @ 2015-11-26 13:28 ` Mathias Nyman 0 siblings, 0 replies; 5+ messages in thread From: Mathias Nyman @ 2015-11-26 13:28 UTC (permalink / raw) To: Alan Stern, Daniel J Blueman; +Cc: USB list, Linux Kernel On 25.11.2015 17:42, Alan Stern wrote: > On Wed, 25 Nov 2015, Daniel J Blueman wrote: > >> On 23 November 2015 at 23:52, Alan Stern <stern@rowland.harvard.edu> wrote: >>> On Sun, 22 Nov 2015, Daniel J Blueman wrote: >>> >>>> On 16 November 2015 at 23:22, Alan Stern <stern@rowland.harvard.edu> wrote: >>>>> On Mon, 16 Nov 2015, Daniel J Blueman wrote: >>>>> >>>>>> Tuning USB suspend [1] in 4.3 on a Dell XPS 15 9553 (Skylake), I see a >>>>>> kworker thread spinning in rpm_suspend [2]. >>>>>> >>>>>> What is the most useful debug to get here beyond the immediate [3]? >>>>> >>>>> You can try doing: >>>>> >>>>> echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control >>>> >>>> kworker and ksoftirqd spinning occurs when I echo 'auto' to all the >>>> USB control entries. Using Alan's excellent tip, we see this being >>>> logged repeatedly at a high rate: >>>> [ 353.245180] usb usb1-port4: status 0107 change 0000 >>>> [ 353.245194] usb usb1-port12: status 0507 change 0000 >>>> [ 353.245202] hub 1-0:1.0: state 7 ports 16 chg 0000 evt 0000 >>>> [ 353.245203] hub 1-0:1.0: hub_suspend >>>> [ 353.245205] usb usb1: bus auto-suspend, wakeup 1 >>>> [ 353.245206] usb usb1: bus suspend fail, err -16 >>>> [ 353.245207] hub 1-0:1.0: hub_resume >>>> ... >>>> >>>> So, EBUSY. Both the webcam is not open, and the bluetooth interface >>>> [1] is rfkill'd; the situation occurs even if I unload all related >>>> modules. >>>> >>>> What further debug would be useful? >>>> >>>> Thanks! >>>> Daniel >>>> >>>> -- [1] >>>> >>>> Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub >>>> Bus 001 Device 002: ID 0a5c:6410 Broadcom Corp. >>>> Bus 001 Device 003: ID 1bcf:2b95 Sunplus Innovation Technology Inc. >>>> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub >>> >>> Since bus 1 uses an xHCI controller, you should do: >>> >>> echo 'module xhci-hcd =p' >/sys/kernel/debug/dynamic_debug/control >>> >>> I'm reasonably sure this will end up printing "suspend failed >>> because a port is resuming", since that's the only place where >>> xhci_bus_suspend() fails with -EBUSY, but you should try it to confirm >>> this. >> >> I had to use: >> echo 'module xhci_hcd =p' >/sys/kernel/debug/dynamic_debug/control >> >> and indeed we see: >> [29172.246221] xhci_hcd 0000:00:14.0: get port status, actual port 11 >> status = 0xe63 >> [29172.246222] xhci_hcd 0000:00:14.0: Get port status returned 0x507 >> [29172.246224] xhci_hcd 0000:00:14.0: get port status, actual port 12 >> status = 0x2a0 >> [29172.246228] xhci_hcd 0000:00:14.0: get port status, actual port 13 >> status = 0x2a0 >> [29172.246228] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246231] xhci_hcd 0000:00:14.0: get port status, actual port 14 >> status = 0x2a0 >> [29172.246232] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246235] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246248] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246254] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246264] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246275] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246285] xhci_hcd 0000:00:14.0: Get port status returned 0x507 >> [29172.246294] xhci_hcd 0000:00:14.0: get port status, actual port 14 >> status = 0x2a0 >> [29172.246302] xhci_hcd 0000:00:14.0: suspend failed because a port is resuming >> [29172.246321] xhci_hcd 0000:00:14.0: Get port status returned 0x107 >> [29172.246332] xhci_hcd 0000:00:14.0: get port status, actual port 6 >> status = 0x2a0 >> [29172.246346] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246359] xhci_hcd 0000:00:14.0: get port status, actual port 13 >> status = 0x2a0 >> [29172.246364] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246366] xhci_hcd 0000:00:14.0: get port status, actual port 15 >> status = 0x2a0 >> [29172.246371] xhci_hcd 0000:00:14.0: suspend failed because a port is resuming >> [29172.246380] xhci_hcd 0000:00:14.0: Get port status returned 0x100 >> [29172.246382] xhci_hcd 0000:00:14.0: get port status, actual port 1 >> status = 0x2a0 >> [29172.246383] xhci_hcd 0000:00:14.0: Get port status returned 0x100 > > It's time to get some assistance from the xhci-hcd maintainer. > > Mathias, Daniel's machine is spinning, trying to suspend the root hub > on the xHCI HS/FS/LS bus and failing, over and over again. It looks > like either bus_state->resuming_ports or bus_state->port_remote_wakeup > is nonzero when it shouldn't be. > > The information we have is all above. Can you help Daniel debug this? > > Alan Stern > Looks like bits in bus_state->resuming_ports may be left uncleared. We set a bit in bus_state->resuming_ports if we get a port status event with a port in resume link state. We only clear it when a "get port status" request finds the port in resume state with a resume timer passed 20ms. If something happens during this 20ms to the port, and it moves to some other link state, then resuming_ports is never cleared. There is a fix for what now appears to be a subset of this issue in 4.4-rc2, clearing the bus_state->resuming_ports in case host initiated resume code moved the port link to U0 and before the "get port status" request finds the port in resume state. I now see that this is not enough. I'll write a patch reordering how we check flags, timers and link state in "get port status" Daniel, can you test a patch like this if I write it for 4.3 kernel? -Mathias ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-11-26 13:23 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CAMVG2sve5KgL8i067HctxePWfzFeX-qy-8nrhhyj5395zhmDow@mail.gmail.com>
[not found] ` <Pine.LNX.4.44L0.1511161021280.1938-100000@iolanthe.rowland.org>
2015-11-22 11:24 ` [4.3] kworker busy in pm_runtime_work Daniel J Blueman
2015-11-23 15:52 ` Alan Stern
2015-11-24 22:23 ` Daniel J Blueman
2015-11-25 15:42 ` Alan Stern
2015-11-26 13:28 ` Mathias Nyman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox