public inbox for linux-usb@vger.kernel.org
 help / color / mirror / Atom feed
From: Alan Stern <stern@rowland.harvard.edu>
To: Marcel Holtmann <marcel@holtmann.org>
Cc: Benjamin Berg <benjamin@sipsolutions.net>,
	linux-usb@vger.kernel.org, linux-bluetooth@vger.kernel.org
Subject: Re: Userspace enumeration hang while btusb tries to load firmware of removed device
Date: Wed, 3 Nov 2021 16:11:38 -0400	[thread overview]
Message-ID: <20211103201138.GC1529362@rowland.harvard.edu> (raw)
In-Reply-To: <BCD95F43-3C6E-4B50-9228-9F2AD93BBBA4@holtmann.org>

On Wed, Nov 03, 2021 at 08:31:03PM +0100, Marcel Holtmann wrote:
> Hi Alan,
> 
> >> a user is seeing a hang in fprintd while enumerating devices which
> >> appears to be caused by an interaction of:
> >> 
> >> * system is resuming from S3
> >> * btusb starts loading firmware
> >> * bluetooth device disappears (probably thinkpad_acpi rfkill)
> >> * libusb enumerates USB devices (fprintd in this case)
> >> 
> >> When this happens, the firmware load fails after a timeout of 10s. It
> >> appears that if userspace queries information about the root hub in
> >> question during this time, it will hang until the btusb firmware load
> >> has timed out.
> >> 
> >> Attaching the full kernel log, below an excerpt, you can see:
> >> * At :12 device removal: "usb 5-4: USB disconnect, device number 33"
> >> * libusb enumeration retrieves information about the usb5 root hub,
> >>   and blocks on this
> >> * At :14 there is a tx timeout on hci0
> >> * At :23 the firmware load finally fails
> >> * Then usb_disable_device happens
> >> * libusb/fprintd gets the usb5 HUB information and continues its
> >>   enumeration
> >> 
> >> As I see it, there may be two issues:
> >> 1. userspace should not block due to the firmware load hanging
> >> 2. btusb should give up more quickly when the device disappears
> >> 
> >> Does anyone have a good idea about the possible cause or how we can fix
> >> the problem?
> >> 
> >> Downstream issue: https://bugzilla.redhat.com/show_bug.cgi?id=2019857
> > 
> > I'm not familiar with the btusb driver, so someone on the 
> > linux-bluetooth mailing list would have a better idea about this. 
> > However, it does look as though btusb keeps the device locked during the 
> > entire 10-second period while it tries to send over the firmware, and it 
> > doesn't abort the procedure when it starts getting disconnection errors 
> > but instead persists until a timeout expires.  Keeping the device locked 
> > would certainly block lsusb.
> > 
> > In general, locking the device during a firmware upload seems like
> > the right thing to do -- you don't want extraneous transfers from
> > other processes messing up the firmware!  So overall, it appears that
> > the whole problem would be solved if the firmware transfer were
> > aborted as soon as the -ENODEV errors start appearing.
> 
> the problem seems to be that we hitting HCI command timeout. So the 
> firmware download is done via HCI commands. These commands are send to 
> the transport driver btusb.c via hdev->send (as btusb_send_frame). 
> This triggers the usb_submit_urb or queues them via data->deferred 
> anchor. All this reports back the error properly except that nobody 
> does anything with it.
> 
> See hci_send_frame() last portion:
> 
>         err = hdev->send(hdev, skb);
>         if (err < 0) {
>                 bt_dev_err(hdev, "sending frame failed (%d)", err);
>                 kfree_skb(skb);
>         }
> 
> And that is it. We are not checking for ENODEV or any error here. That 
> means the failure of the HCI command gets only caught via the HCI 
> command timeout. I don’t know how to do this yet, but you would have 
> to look there to fail HCI command right away instead of waiting for 
> the timeout.

I have no idea how all the different layers work here.  Clearly 
something has to signal hdev->req_wait_q after setting hdev->req_status 
to some appropriate value.  Can this be done in btusb.c, either when the 
URB is submitted or when it completes?  Or in hci_send_frame?

Alan Stern

  reply	other threads:[~2021-11-03 20:11 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-03 14:55 Userspace enumeration hang while btusb tries to load firmware of removed device Benjamin Berg
2021-11-03 18:23 ` Alan Stern
2021-11-03 19:31   ` Marcel Holtmann
2021-11-03 20:11     ` Alan Stern [this message]
2021-11-09 17:03       ` Benjamin Berg
2021-11-04  9:34     ` Benjamin Berg
2021-11-04 13:28       ` Alan Stern
2021-11-04 14:19         ` Benjamin Berg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211103201138.GC1529362@rowland.harvard.edu \
    --to=stern@rowland.harvard.edu \
    --cc=benjamin@sipsolutions.net \
    --cc=linux-bluetooth@vger.kernel.org \
    --cc=linux-usb@vger.kernel.org \
    --cc=marcel@holtmann.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox