From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: Linux USB <linux-usb@vger.kernel.org>,
Alan Stern <stern@rowland.harvard.edu>
Subject: [02/20] usb: host: xhci: check DYING state earlier
Date: Wed, 2 May 2018 16:02:52 +0300
Message-ID: <71780ddd-d0f5-e580-af3f-401d33e36506@linux.intel.com>
On 02.05.2018 14:46, Felipe Balbi wrote:
>
> Hi,
>
> Mathias Nyman <mathias.nyman@linux.intel.com> writes:
>> On 17.04.2018 10:07, Felipe Balbi wrote:
>>>
>>> Hi,
>>>
>>> Mathias Nyman <mathias.nyman@linux.intel.com> writes:
>>>> On 16.04.2018 15:29, Felipe Balbi wrote:
>>>>> Instead of allocating urb priv and, maybe, bail out due to xhci being
>>>>> in DYING state, we can move the check earlier and avoid the memory
>>>>> allocation altogether.
>>>>
>>>> This also moves checking for DYING outside the lock.
>>>>
>>>> Most cases set DYING with lock held, so if we first get the lock before
>>>> checking DYING we have a better chance of not being in the process of dying.
>>>
>>> pretty sure that's atomic, though.
>>
>> That's not what I'm after, your fix is cleaning up code in the case where DYING flag is
>> set before xhci_urb_enqueue() is called. I'm worried about the case when setting DYING flag races
>> with xhci_urb_enqueue(). i.e. xhci_urb_enqueue() is spinning on the lock, waiting, while
>> some other part of the driver is desperately trying to access hw with lock held, failing,
>> finally setting DYING flag, and then releasing lock.
>>
>> If the check is done before taking the lock then the URB might be queued on dying device,
>> at a time when xhci_hc_died already started cancelling and giving back all queued URB
>
> this can only happen if checking that bit isn't an atomic operation
> which, AFAICT, it is. IOW, it would be the same if you were to change:
>
> 	if (a & b)
> 		return -1;
>
> to:
>
> 	if (test_bit(b, &a))
> 		return -1;
>
> right? Now, if this isn't an atomic operation, I'm happy to be educated.
Again, it's not about being atomic.
As an example, let's take the get port status request racing with queueing a URB.
After this patch the following is possible:

CPU:0                                   CPU:1
get port status                         queue URB

xhci_hub_control()                      xhci_urb_enqueue()
  spin_lock(lock), got it                 XHCI_STATE_DYING? no, continue
  temp = readl(port_array[wIndex])        spin_lock(lock), already taken, spin here
  if (temp == ~(u32)0) {
    xhci_hc_died(xhci)
      xhc_state |= XHCI_STATE_DYING
      cleanup_command_queue()
      kill_endpoint_urbs()
  spin_unlock(lock) // at URB giveback    spin_lock(lock), got it, finally
                                          allocate urb_priv, plus other stuff
                                          xhci_queue_*_tx()
                                            count_trbs_needed(urb)
                                            prepare_transfer()
                                            queue_trb() // for each trb

So it's more likely that we end up queuing URBs on a dead host, a host that the driver
has already started tearing down and whose URBs it is already giving back and freeing.
xhci_hub_control() was just one example; you can replace it with almost any function
that calls xhci_hc_died().
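
To make the ordering concrete, here is a minimal user-space sketch that replays
the interleaving above one step at a time. It is not driver code: fake_hc,
STATE_DYING, queue_urb() and replay_race() are made-up stand-ins for the xhci
state, XHCI_STATE_DYING and the enqueue path, and the "lock" is only a flag
that asserts correct pairing. It only shows that the result depends on where
the flag is sampled relative to the lock, not on whether the read is atomic.

```c
#include <assert.h>
#include <stdbool.h>

#define STATE_DYING 0x1u	/* hypothetical stand-in for XHCI_STATE_DYING */

struct fake_hc {
	unsigned int state;	/* stand-in for xhci->xhc_state */
	bool lock_held;		/* models xhci->lock, single-threaded replay */
	int queued_while_dying;	/* URBs queued after the host was marked dying */
};

static void fake_lock(struct fake_hc *hc)
{
	assert(!hc->lock_held);
	hc->lock_held = true;
}

static void fake_unlock(struct fake_hc *hc)
{
	assert(hc->lock_held);
	hc->lock_held = false;
}

/* Stand-in for the TRB queuing work; counts URBs that land on a dead host. */
static void queue_urb(struct fake_hc *hc)
{
	if (hc->state & STATE_DYING)
		hc->queued_while_dying++;
}

/*
 * Replay the CPU:0 / CPU:1 interleaving from the diagram.
 * check_under_lock == false: flag sampled before taking the lock (patched).
 * check_under_lock == true:  flag sampled after taking the lock (current).
 * Returns how many URBs were queued on a host already marked dying.
 */
int replay_race(bool check_under_lock)
{
	struct fake_hc hc = { 0 };
	bool cpu1_saw_dying = false;

	/* CPU:0 enters the hub control path and takes the lock. */
	fake_lock(&hc);

	/* CPU:1, patched variant: sample the flag, then wait for the lock. */
	if (!check_under_lock)
		cpu1_saw_dying = (hc.state & STATE_DYING) != 0;

	/* CPU:0: register read fails, host is declared dead, lock released. */
	hc.state |= STATE_DYING;
	fake_unlock(&hc);

	/* CPU:1 finally gets the lock. */
	fake_lock(&hc);
	if (check_under_lock)
		cpu1_saw_dying = (hc.state & STATE_DYING) != 0;
	if (!cpu1_saw_dying)
		queue_urb(&hc);
	fake_unlock(&hc);

	return hc.queued_while_dying;
}
```

In this replay, replay_race(false) ends with one URB queued on the dying host,
while replay_race(true) bails out without queuing anything. Each read of the
flag is a single load in both variants; only its position relative to the lock
differs.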
>
>>>> Small thing, but so is this cleanup, so not sure its worth the change
>>>
>>> Look at the result. With this change we don't need to take a lock,
>>> allocate memory, search for endpoint index, search for endpoint
>>> state. All of those are needed for proper operation of the function, but
>>> if the controller has already died, there's no point in going any
>>> further.
>>
>> But we might miss the fact that host died, and go even further, adding URB to list,
>> writing TRBs to ringbuffers etc.
>>
>> In code we save one line,
>> goto: free_priv
>
> We're saving a lot more than that, actually. All of the following ends
> up being skipped. All of these are unnecessary work when xHC has already
> died:
In lines of code in the driver, it's just one line.
In terms of extra code being run, it's a gamble.
Before the patch we ran the code below; after the patch it's either nothing, or the code
below plus all the URB/TRB queuing code.
>
> 8<------------------------------------------------------------------------
>
> 	slot_id = urb->dev->slot_id;
> 	ep_index = xhci_get_endpoint_index(&urb->ep->desc);
> 	ep_state = &xhci->devs[slot_id]->eps[ep_index].ep_state;
>
> 	if (!HCD_HW_ACCESSIBLE(hcd)) {
> 		if (!in_interrupt())
> 			xhci_dbg(xhci, "urb submitted during PCI suspend\n");
> 		return -ESHUTDOWN;
> 	}
>
> 	if (usb_endpoint_xfer_isoc(&urb->ep->desc))
> 		num_tds = urb->number_of_packets;
> 	else if (usb_endpoint_is_bulk_out(&urb->ep->desc) &&
> 	    urb->transfer_buffer_length > 0 &&
> 	    urb->transfer_flags & URB_ZERO_PACKET &&
> 	    !(urb->transfer_buffer_length % usb_endpoint_maxp(&urb->ep->desc)))
> 		num_tds = 2;
> 	else
> 		num_tds = 1;
>
> 	urb_priv = kzalloc(sizeof(struct urb_priv) +
> 			   num_tds * sizeof(struct xhci_td), mem_flags);
> 	if (!urb_priv)
> 		return -ENOMEM;
>
> 	urb_priv->num_tds = num_tds;
> 	urb_priv->num_tds_done = 0;
> 	urb->hcpriv = urb_priv;
>
> 	trace_xhci_urb_enqueue(urb);
>
> 	if (usb_endpoint_xfer_control(&urb->ep->desc)) {
> 		/* Check to see if the max packet size for the default control
> 		 * endpoint changed during FS device enumeration
> 		 */
> 		if (urb->dev->speed == USB_SPEED_FULL) {
> 			ret = xhci_check_maxpacket(xhci, slot_id,
> 					ep_index, urb);
> 			if (ret < 0) {
> 				xhci_urb_free_priv(urb_priv);
> 				urb->hcpriv = NULL;
> 				return ret;
> 			}
> 		}
> 	}
>
> 	spin_lock_irqsave(&xhci->lock, flags);
>
> 8<------------------------------------------------------------------------
>