All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathias Nyman <mathias.nyman@intel.com>
To: Alan Stern <stern@rowland.harvard.edu>,
	Daniel J Blueman <daniel@quora.org>
Cc: USB list <linux-usb@vger.kernel.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [4.3] kworker busy in pm_runtime_work
Date: Thu, 26 Nov 2015 15:28:39 +0200	[thread overview]
Message-ID: <56570907.7030209@intel.com> (raw)
In-Reply-To: <Pine.LNX.4.44L0.1511251037370.1679-100000@iolanthe.rowland.org>

On 25.11.2015 17:42, Alan Stern wrote:
> On Wed, 25 Nov 2015, Daniel J Blueman wrote:
>
>> On 23 November 2015 at 23:52, Alan Stern <stern@rowland.harvard.edu> wrote:
>>> On Sun, 22 Nov 2015, Daniel J Blueman wrote:
>>>
>>>> On 16 November 2015 at 23:22, Alan Stern <stern@rowland.harvard.edu> wrote:
>>>>> On Mon, 16 Nov 2015, Daniel J Blueman wrote:
>>>>>
>>>>>> Tuning USB suspend [1] in 4.3 on a Dell XPS 15 9553 (Skylake), I see a
>>>>>> kworker thread spinning in rpm_suspend [2].
>>>>>>
>>>>>> What is the most useful debug to get here beyond the immediate [3]?
>>>>>
>>>>> You can try doing:
>>>>>
>>>>>          echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control
>>>>
>>>> kworker and ksoftirqd spinning occurs when I echo 'auto' to all the
>>>> USB control entries. Using Alan's excellent tip, we see this being
>>>> logged repeatedly at a high rate:
>>>> [  353.245180] usb usb1-port4: status 0107 change 0000
>>>> [  353.245194] usb usb1-port12: status 0507 change 0000
>>>> [  353.245202] hub 1-0:1.0: state 7 ports 16 chg 0000 evt 0000
>>>> [  353.245203] hub 1-0:1.0: hub_suspend
>>>> [  353.245205] usb usb1: bus auto-suspend, wakeup 1
>>>> [  353.245206] usb usb1: bus suspend fail, err -16
>>>> [  353.245207] hub 1-0:1.0: hub_resume
>>>> ...
>>>>
>>>> So, EBUSY. Both the webcam is not open, and the bluetooth interface
>>>> [1] is rfkill'd; the situation occurs even if I unload all related
>>>> modules.
>>>>
>>>> What further debug would be useful?
>>>>
>>>> Thanks!
>>>>    Daniel
>>>>
>>>> -- [1]
>>>>
>>>> Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
>>>> Bus 001 Device 002: ID 0a5c:6410 Broadcom Corp.
>>>> Bus 001 Device 003: ID 1bcf:2b95 Sunplus Innovation Technology Inc.
>>>> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>>>
>>> Since bus 1 uses an xHCI controller, you should do:
>>>
>>>          echo 'module xhci-hcd =p' >/sys/kernel/debug/dynamic_debug/control
>>>
>>> I'm reasonably sure this will end up printing "suspend failed
>>> because a port is resuming", since that's the only place where
>>> xhci_bus_suspend() fails with -EBUSY, but you should try it to confirm
>>> this.
>>
>> I had to use:
>> echo 'module xhci_hcd =p' >/sys/kernel/debug/dynamic_debug/control
>>
>> and indeed we see:
>> [29172.246221] xhci_hcd 0000:00:14.0: get port status, actual port 11
>> status  = 0xe63
>> [29172.246222] xhci_hcd 0000:00:14.0: Get port status returned 0x507
>> [29172.246224] xhci_hcd 0000:00:14.0: get port status, actual port 12
>> status  = 0x2a0
>> [29172.246228] xhci_hcd 0000:00:14.0: get port status, actual port 13
>> status  = 0x2a0
>> [29172.246228] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246231] xhci_hcd 0000:00:14.0: get port status, actual port 14
>> status  = 0x2a0
>> [29172.246232] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246235] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246248] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246254] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246264] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246275] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246285] xhci_hcd 0000:00:14.0: Get port status returned 0x507
>> [29172.246294] xhci_hcd 0000:00:14.0: get port status, actual port 14
>> status  = 0x2a0
>> [29172.246302] xhci_hcd 0000:00:14.0: suspend failed because a port is resuming
>> [29172.246321] xhci_hcd 0000:00:14.0: Get port status returned 0x107
>> [29172.246332] xhci_hcd 0000:00:14.0: get port status, actual port 6
>> status  = 0x2a0
>> [29172.246346] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246359] xhci_hcd 0000:00:14.0: get port status, actual port 13
>> status  = 0x2a0
>> [29172.246364] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246366] xhci_hcd 0000:00:14.0: get port status, actual port 15
>> status  = 0x2a0
>> [29172.246371] xhci_hcd 0000:00:14.0: suspend failed because a port is resuming
>> [29172.246380] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>> [29172.246382] xhci_hcd 0000:00:14.0: get port status, actual port 1
>> status  = 0x2a0
>> [29172.246383] xhci_hcd 0000:00:14.0: Get port status returned 0x100
>
> It's time to get some assistance from the xhci-hcd maintainer.
>
> Mathias, Daniel's machine is spinning, trying to suspend the root hub
> on the xHCI HS/FS/LS bus and failing, over and over again.  It looks
> like either bus_state->resuming_ports or bus_state->port_remote_wakeup
> is nonzero when it shouldn't be.
>
> The information we have is all above.  Can you help Daniel debug this?
>
> Alan Stern
>

Looks like bits in bus_state->resuming_ports may be left uncleared.

We set a bit in bus_state->resuming_ports if we get a port status event with a port
in resume link state. We only clear it when a "get port status" request finds the port
in resume state with a resume timer passed 20ms.

If something happens during this 20ms to the port, and it moves to some other link state,
then resuming_ports is never cleared.

There is a fix for what now appears to be a subset of this issue in 4.4-rc2, clearing
the bus_state->resuming_ports in case host initiated resume code moved the port link to
U0 and before the "get port status" request finds the port in resume state.
I now see that this is not enough.

I'll write a patch reordering how we check flags, timers and link state in "get port status"

Daniel, can you test a patch like this if I write it for 4.3 kernel?

-Mathias

      reply	other threads:[~2015-11-26 13:23 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CAMVG2sve5KgL8i067HctxePWfzFeX-qy-8nrhhyj5395zhmDow@mail.gmail.com>
     [not found] ` <Pine.LNX.4.44L0.1511161021280.1938-100000@iolanthe.rowland.org>
2015-11-22 11:24   ` [4.3] kworker busy in pm_runtime_work Daniel J Blueman
2015-11-23 15:52     ` Alan Stern
2015-11-24 22:23       ` Daniel J Blueman
2015-11-25 15:42         ` Alan Stern
2015-11-26 13:28           ` Mathias Nyman [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56570907.7030209@intel.com \
    --to=mathias.nyman@intel.com \
    --cc=daniel@quora.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.