From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: "Michał Pecio" <michal.pecio@gmail.com>,
"Alan Stern" <stern@rowland.harvard.edu>
Cc: Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>
Subject: Re: How are halted endpoints supposed to be handled in Linux?
Date: Thu, 21 Nov 2024 16:08:15 +0200 [thread overview]
Message-ID: <f5a2d3f8-b895-4617-b11e-cc134e3922c3@linux.intel.com> (raw)
In-Reply-To: <20241121112653.06ba4ee5@foxbook>
On 21.11.2024 12.26, Michał Pecio wrote:
> Hi Alan,
>
> Thank you for taking the time to answer. I'm beginning to regret not
> asking this question earlier.
>
> On Wed, 20 Nov 2024 21:31:29 -0500, Alan Stern wrote:
>>>> Linux appears to ignore this part and only reset on STALL
>>>> handshake, as advised in
>>>> Documentation/driver-api/usb/error-codes.rst and practiced by
>>>> drivers - they don't seem to bother with usb_clear_halt() on
>>>> -EPROTO.
For xhci I assumed that the device endpoint is halted when we receive
a 'Stall Error' transfer event for bulk or interrupt transfers.
Other errors such as 'Transaction Error' do halt the host side,
but does not necessarily mean whole endpoint is halted. This is based
on the info in xhci 4.6.8.1 "Soft Retry" :
"xHC shall halt an endpoint with a USB Transaction Error after CErr
retries have been performed. The USB device is not aware that the xHC
has halted the endpoint, and will be waiting for another retry, so a
Soft Retry may be used to perform additional retries and recover from
an error which has caused the xHC to halt an endpoint."
>>
>> This is generally a weakness in the drivers. It would be nice if the
>> class specifications said what to do in these situations, but most of
>> them don't.
>
> There is also proprietary hardware with no specification at all.
>
>>>> On the HCD side, xHCI will:
>>>> - give back the current URB with -EPROTO/-EPIPE status
>>>> - reset the host side endpoint, clearing its toggle state
>>>> - point the HC at the next URB if one exist
>>>> - restart the endpoint without waiting for hcd->endpoint_reset()
Intention was not to restart the endpoint, but turns out we end up doing it.
Basically we should not ring the doorbell when 'Set TR Deq' command completes
for a bulk or interrupt endpoint in case the command was queued to resolve a
Stall Error. For control endpoint we should restart the ring (xHC halts
the internal control endpoint on errors/stall)
>>>> - ignore one subsequent call to hcd->endpoint_reset()
>>
>> This behavior is not correct. See the kerneldoc for
>> usb_unlink_urb() in drivers/usb/core/urb.c, especially the section
>> labelled "Unlinking and Endpoint Queues".
>
> OK, let's go through it.
>
> * But when an URB terminates with an
> * error its queue generally stops (see below), at least until that URB's
> * completion routine returns.
>
> I don't see this working after xHCI adopted bottom half giveback, which
> is asynchronous. As you are the maintainer of EHCI, which also uses BH,
> how is EHCI dealing with it?
>
> One way I see with existing APIs is to wait until the driver submits a
> new URB, but are drivers prepared for this? Is EHCI doing the same?
Using a BH also means class driver may queue a new URB to xhci after xhci has
cleared its internal endpoint halt condition, but before class driver is
aware that endpoint halted.
>
> * It is guaranteed that a stopped queue
> * will not restart until all its unlinked URBs have been fully retired,
> * with their completion routines run
>
> I think xHCI can currently guarantee that nothing is restarted until
> any URB unlinked for any reason is given to the BH worker, but that's
> all we have, and I just broke it in usb-next:
>
> + /* Try to restart the endpoint if all is done */
> + ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
> + /* Start giving back any TDs invalidated above */
> + xhci_giveback_invalidated_tds(ep);
>
> This is part of a legitimate bugfix patch tagged for stable. I should
> have insisted on keeping it a separate change, but it seemed a good idea
> at the time which would soon get implemented anyway... Maybe no more?
I don't think your patch makes the existing issue worse in any significant
way.
Thanks
Mathias
next prev parent reply other threads:[~2024-11-21 14:05 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-20 23:11 How are halted endpoints supposed to be handled in Linux? Michał Pecio
2024-11-21 0:02 ` Thinh Nguyen
2024-11-21 2:31 ` Alan Stern
2024-11-21 10:26 ` Michał Pecio
2024-11-21 14:08 ` Mathias Nyman [this message]
2024-11-22 11:54 ` Michał Pecio
2024-11-21 15:06 ` Alan Stern
2024-11-22 12:57 ` Michał Pecio
2024-11-22 19:28 ` Alan Stern
2024-11-22 23:25 ` Michał Pecio
2024-11-23 2:39 ` Alan Stern
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f5a2d3f8-b895-4617-b11e-cc134e3922c3@linux.intel.com \
--to=mathias.nyman@linux.intel.com \
--cc=Thinh.Nguyen@synopsys.com \
--cc=linux-usb@vger.kernel.org \
--cc=michal.pecio@gmail.com \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox