public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Paul Zimmerman <pauldzim@gmail.com>
To: Alan Stern <stern@rowland.harvard.edu>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	David Heinzelmann <heinzelmann.david@gmail.com>
Cc: <linux-kernel@vger.kernel.org>, <linux-usb@vger.kernel.org>
Subject: Re: [REGRESSION][BISECTED] 5.5-rc suspend/resume failure caused by patch a4f55d8b8c14 ("usb: hub: Check device descriptor before resusciation")
Date: Tue, 21 Jan 2020 19:31:31 -0700	[thread overview]
Message-ID: <20200121193131.070a28bf@EliteBook> (raw)

On Mon, 20 Jan 2020 13:52:15 -0700 Paul Zimmerman <pauldzim@gmail.com> wrote:

> On Mon, 20 Jan 2020 10:23:11 -0500 (EST) Alan Stern <stern@rowland.harvard.edu> wrote:
>   
> > On Sun, 19 Jan 2020, Paul Zimmerman wrote:
> >     
> > > I reported this regression last week (see
> > > https://lore.kernel.org/linux-usb/20200115153714.03d5b3aa@EliteBook/T/#u)
> > > but I got no response to my email. Today I have retested with
> > > 5.5-rc7 and verified that the problem still exists. So I am
> > > resending with a different subject line to see if anyone responds.
> > > 
> > > The $subject patch causes a regression on my HP EliteBook laptop
> > > with a built-in USB bluetooth adapter. About 50% of the time, a
> > > suspend/resume cycle will cause the bluetooth adapter to stop
> > > working.
> > > 
> > > The dmesg log below shows two suspend/resume cycles. At time
> > > 63.928 you can see the bluetooth adapter being successfully
> > > resumed, and at time 140.969 you can see it fail. After reverting
> > > the patch, the bluetooth adapter resumes 100% of the time.
> > > 
> > > I also included below a lsusb -v of the bluetooth adapter. Is
> > > there any other debugging info you'd like me to send?      
> > 
> > It looks like your dmesg log was made without enabling debugging 
> > messages in usbcore.  Can you collect another log with debugging 
> > messages turned on?
> > 
> > 	echo 'module usbcore =p'    
> > >/sys/kernel/debug/dynamic_debug/control    
> > 
> > Also, it might not hurt to collect and post a usbmon trace for a bad
> > suspend-resume cycle.    
> 
> Hi Alan,
> 
> Thanks for responding. The new dmesg log and the usbmon trace are
> below. The dmesg shows a good suspend/resume followed by a bad one.
> The bluetooth device is usb 2-3.2 I believe. The usbmon trace is only
> for the failed suspend/resume case.  

I did some more debugging on this using ftrace, here is an annotated
log that shows what I think is happening on a failed resume. Note that
hub_port_connect_change() is the function that the patch modified to
call usb_get_device_descriptor() in some circumstances.

This first call to hub_port_connect_change() is for a different device,
not the failing one. There are multiple other calls to that function in
the trace, but they are not for the device in question:
     kworker/1:6-19987 [001]  7803.175058: funcgraph_entry:                      |  hub_port_connect_change() {
     kworker/1:6-19987 [001]  7803.175069: funcgraph_entry:      ! 11911.151 us  |    usb_get_device_descriptor();
     kworker/2:5-20675 [002]  7803.179333: funcgraph_entry:                      |  usb_probe_interface() {

Starting here is a worker thread that is trying to setup the bluetooth
adapter after the resume:
   kworker/u17:1-3175  [002]  7803.179457: funcgraph_entry:                      |  hci_power_on() {
   kworker/u17:1-3175  [002]  7803.179458: funcgraph_entry:                      |    hci_dev_do_open() {
   kworker/u17:1-3175  [002]  7803.179468: funcgraph_entry:                      |      btusb_setup_intel() {
     kworker/2:5-20675 [002]  7803.179480: funcgraph_exit:       ! 147.596 us    |  }

And here is a worker thread that is handling the connect change on the USB
port with the bluetooth device. This happens while the btusb_setup_intel()
function called by the other thread above is still running:
     kworker/3:1-16790 [003]  7803.181323: funcgraph_entry:                      |  hub_port_connect_change() {
     kworker/3:1-16790 [003]  7803.181330: funcgraph_entry:                      |    usb_get_device_descriptor() {
     kworker/0:2-19962 [000]  7803.184885: funcgraph_entry:        6.016 us      |  hub_port_connect_change();
     kworker/0:2-19962 [000]  7803.187208: funcgraph_entry:        5.462 us      |  hub_port_connect_change();
     kworker/1:6-19987 [001]  7803.187835: funcgraph_exit:       ! 12776.810 us  |  }
     kworker/1:6-19987 [001]  7803.187846: funcgraph_entry:      ! 767445.344 us |  hub_port_connect_change();

By this point the failure has happened ("Bluetooth: hci0: Reading Intel
version information failed (-110)") , and it looks like somebody
queues another work to try setting up the bluetooth adapter again:
       hciconfig-21074 [000]  7803.193549: funcgraph_entry:                      |  hci_dev_open() {
     kworker/1:6-19987 [001]  7803.955315: funcgraph_entry:      ! 144039.307 us |  hub_port_connect_change();
   kworker/u17:1-3175  [003]  7805.203325: funcgraph_exit:       ! 2023855 us    |      }
   kworker/u17:1-3175  [003]  7805.211728: funcgraph_exit:       ! 2032269 us    |    }
   kworker/u17:1-3175  [003]  7805.211731: funcgraph_exit:       ! 2032273 us    |  }
       hciconfig-21074 [000]  7805.211760: funcgraph_entry:                      |    hci_dev_do_open() {
       hciconfig-21074 [000]  7805.211777: funcgraph_entry:      ! 3069605 us    |      btusb_setup_intel();

And here the usb_get_device_descriptor() call made by the USB worker thread
finally completes. It also fails ("kworker/3:1 timed out on ep0in len=0/18"):
     kworker/3:1-16790 [003]  7808.277224: funcgraph_exit:       ! 5095893 us    |    }
       hciconfig-21074 [001]  7808.283313: funcgraph_exit:       ! 3071554 us    |    }
       hciconfig-21074 [001]  7808.283315: funcgraph_exit:       ! 5089767 us    |  }

So if I'm understanding this correctly, there are two threads that are
trying to access the USB bluetooth device at the same time. I have no
idea if that is how it's supposed to work.

Thanks,
Paul

             reply	other threads:[~2020-01-22  2:31 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-22  2:31 Paul Zimmerman [this message]
2020-01-22 19:31 ` [REGRESSION][BISECTED] 5.5-rc suspend/resume failure caused by patch a4f55d8b8c14 ("usb: hub: Check device descriptor before resusciation") Alan Stern
2020-01-22 23:54   ` Paul Zimmerman
2020-01-23 15:50     ` Alan Stern
2020-01-30 21:16       ` Paul Zimmerman
2020-01-31 15:37         ` Alan Stern
  -- strict thread matches above, loose matches on Subject: below --
2020-01-15 22:37 Regression caused by patch a4f55d8b8c14 "usb: hub: Check device descriptor before resusciation" Paul Zimmerman
2020-01-20  4:11 ` [REGRESSION][BISECTED] 5.5-rc suspend/resume failure caused by patch a4f55d8b8c14 ("usb: hub: Check device descriptor before resusciation") Paul Zimmerman
2020-01-20 15:23   ` Alan Stern
2020-01-20 20:52     ` Paul Zimmerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200121193131.070a28bf@EliteBook \
    --to=pauldzim@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=heinzelmann.david@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox