Re: [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error

public inbox for linux-usb@vger.kernel.org
 help / color / mirror / Atom feed

From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: "stern@rowland.harvard.edu" <stern@rowland.harvard.edu>
Cc: Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
	"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"michal.pecio@gmail.com" <michal.pecio@gmail.com>,
	"oneukum@suse.com" <oneukum@suse.com>,
	"niklas.neronin@linux.intel.com" <niklas.neronin@linux.intel.com>
Subject: Re: [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error
Date: Tue, 31 Mar 2026 12:34:54 +0300	[thread overview]
Message-ID: <54121929-d775-45a0-b31d-09b783aada5d@linux.intel.com> (raw)
In-Reply-To: <9e62ef5a-5b31-4fca-891b-d028f5bbfc05@rowland.harvard.edu>

On 3/30/26 17:17, stern@rowland.harvard.edu wrote:
> On Mon, Mar 30, 2026 at 03:51:46PM +0300, Mathias Nyman wrote:
>> Ideally xhci driver would return the URB with EPIPE after STALL, and not continue
>> processing URBs before a clear halt is sent, or a new URB is enqueued.
>> USB core would hold off submitting URBs to xhci, buffering URBs enqueued for this
>> STALLED endpoint until class driver urb->complete() finishes for that EPIPE URB.
>>
>> Usb core could flag this endpoint as "halt_pending" before adding the EPIPE URB to
>> the bh workqueue. Then after urb->complete() work is called and core is sure class
>> driver  is aware of the EPIPE, then core would clear the flag and flush the buffered
>> URBs to the host controller drivers, restarting the ring
> 
> This is not practical; it would result in a big slowdown for large bulk
> transfers.  Doing this would mean the core could not send an URB to the
> HCD until all the previous URBs for that endpoint had completed and the
> interrupt handler had run, which would add significant latency to
> transfers.

The URB submit buffering in core would only take place if endpoint is halted
with EPIPE/EPROTO.

Additional latency should be limited to the time between interrupt handler
returns a URB with EPIPE/EPROTO, and the bh workqueue finishing urb->complete()
for that URB

URBs would normally be sent to HCD directly.

Yes, this might end up being quite complex, need to make sure it solves more
issues than it creates.

Usb core is aware of the halted endpoint before the class driver due to the bh
workqueue giveback. To me it would make sense to make core responsible for babysitting
the class driver urb submission for the time it withholds this information.

Whole reason for this is to prevent new URB submission from HCD restarting an endpoint
after HCD gave back a halted URB, and HCD assuming core/class drivers are aware of
the halt when the new URB is submitted.

Other option is that usb core would just refuse class driver from submitting URBs
during this time. usb_submit_urb() would return with an error, but I think this might
impact existing class drivers more.

> 
> Exactly what should happen to URBs coming after one that completes with
> -EPIPE is not immediately obvious.  If the HCD did try to send them to
> the device right away then they would also be stalled, which is
> obviously non-productive.  The only guarantee the kernel makes
> about this situation is that the endpoint queue won't restart
> until all completed or unlinked URBs have been given back (likewise for
> -EPROTO errors).

My take is that endpoint should stop processing URBs, existing pending URBs
should be retired by class/core, new URBs should restart the endpoint but new URBs
are only permitted _after_ submitter is aware of the halt.

A class driver testing USB specification should be able to resubmit a stalled URB
(EPIPE) and expect it to complete with EPIPE again until it clears the halt.

> 
> The only safe course available to class drivers is to unlink all the
> pending URBs.  In theory the core could do this for all drivers
> automatically, but at present it doesn't.
> 
>> Class driver urb->complete() would most likely retire/cancel the pending URBs, both the
>> earlier queued 'tainted' URBs, and the most recent 'buffered' URBs in usb core.
>> Class driver should clear the halt, but is free to do whatever it wants.
>> It could choose to send a new URB without clearing the halt,
>> have it processed, trigger STALL again, and get URB returned with EPIPE status.
>>
>> I don't think most class/usb device drivers really handle stall that well.
>> Above might be wishful thinking.
> 
> Indeed.  We can make life a little easier for drivers, but clearing the
> endpoint halt is up to them.
> 
>>> Currently you have the xhci driver to "retire" the halted URBs. However,
>>> you also noted that class/core may attempt to retire the pending URBs.
>>> Who's expected to handle the retirement here?
>>
>> Maybe we should let core retire the pending URBS, and only fix the xhci driver
>> 'automatic endpoint restart after stall' part after that core change is done.
>>
>> Should cause less regression.
>>
>>>
>>> On a separate note, will you plan to implement the clear-halt for EPROTO
>>> in xhci?
>>
>> I don't think this should be part of xhci driver. Decision to send control requests
>> to the device should be done by core or class drivers.
> 
> Here's a troubling consequence for people to consider: Suppose an
> endpoint queue stops with -EPROTO or -EPIPE, and before the class driver
> gets around to calling usb_clear_halt(), it is unbound.  What happens
> the next time a driver binds to the device and tries to use the
> endpoint?

The disable/enable interface and set config calls during unbind/bind should,
if I remember correctly flush pending URBs and drop and re-add the endpoint,
clearing the xhci side halt and reset toggle.

> 
> In particular: What does xhci-hcd do if an URB is submitted for an
> endpoint whose queue is currently stopped?  Does it reject the
> submission with some sort of error code, or does it go ahead and add the
> URB to the end of the non-advancing queue?  If the latter is true, how
> will a newly bound driver ever discover that the queue is stopped?

xhci-hcd will queue the new URB  and restart the ring, if device side endpoint
remains halted due to uncleared halt then transfer stalls and URB is returned
with EPIPE.

Thanks
Mathias

next prev parent reply	other threads:[~2026-03-31  9:35 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-23 12:25 [RFC PATCH 0/2] fix xhci endpoint restart at EPROTO Mathias Nyman
2026-03-23 12:25 ` [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error Mathias Nyman
2026-03-25  1:52   ` Thinh Nguyen
2026-03-25  9:38     ` Mathias Nyman
2026-03-26  1:19       ` Thinh Nguyen
2026-03-26 11:25         ` Mathias Nyman
2026-03-26 23:24           ` Thinh Nguyen
2026-03-30 12:51             ` Mathias Nyman
2026-03-30 14:17               ` stern
2026-03-31  9:34                 ` Mathias Nyman [this message]
2026-03-31 15:31                   ` stern
2026-04-01 22:08                     ` Mathias Nyman
2026-04-02  2:36                       ` stern
2026-04-03  1:59                         ` Thinh Nguyen
2026-04-03  2:42                           ` stern
2026-04-03  8:51                             ` Michal Pecio
2026-04-03 14:55                               ` stern
2026-04-03 19:13                                 ` xhci-hcd and URB_SHORT_NOT_OK Michal Pecio
2026-04-03 20:17                                   ` stern
2026-04-04  1:15                             ` [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error Thinh Nguyen
2026-04-04  1:54                               ` stern
2026-04-04 20:41                                 ` Thinh Nguyen
2026-04-04 21:54                                   ` Alan Stern
2026-04-04 22:15                                     ` Thinh Nguyen
2026-04-04 22:28                                       ` Thinh Nguyen
2026-04-05  1:30                                         ` Alan Stern
2026-04-05  3:10                                           ` Thinh Nguyen
2026-04-07 15:23                                             ` Alan Stern
2026-04-07 20:24                                               ` Mathias Nyman
2026-04-01 22:08               ` Thinh Nguyen
2026-04-01 22:34                 ` Mathias Nyman
2026-04-01 22:47                   ` Thinh Nguyen
2026-03-23 12:25 ` [RFC PATCH 2/2] xhci: Ensure URB is given back when endpoint halts on a multi-TD URB Mathias Nyman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54121929-d775-45a0-b31d-09b783aada5d@linux.intel.com \
    --to=mathias.nyman@linux.intel.com \
    --cc=Thinh.Nguyen@synopsys.com \
    --cc=linux-usb@vger.kernel.org \
    --cc=michal.pecio@gmail.com \
    --cc=niklas.neronin@linux.intel.com \
    --cc=oneukum@suse.com \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox