Re: [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error

public inbox for linux-usb@vger.kernel.org
 help / color / mirror / Atom feed

From: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
To: "stern@rowland.harvard.edu" <stern@rowland.harvard.edu>
Cc: Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
	Mathias Nyman <mathias.nyman@linux.intel.com>,
	"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>,
	"michal.pecio@gmail.com" <michal.pecio@gmail.com>,
	"oneukum@suse.com" <oneukum@suse.com>,
	"niklas.neronin@linux.intel.com" <niklas.neronin@linux.intel.com>
Subject: Re: [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error
Date: Sat, 4 Apr 2026 01:15:52 +0000	[thread overview]
Message-ID: <20260404011530.aukxllvizvmc3f3x@synopsys.com> (raw)
In-Reply-To: <28a00143-85fa-4043-93a0-c2b687ff1bcd@rowland.harvard.edu>

On Thu, Apr 02, 2026, stern@rowland.harvard.edu wrote:
> On Fri, Apr 03, 2026 at 01:59:58AM +0000, Thinh Nguyen wrote:
> > On Wed, Apr 01, 2026, stern@rowland.harvard.edu wrote:
> > > On Thu, Apr 02, 2026 at 01:08:31AM +0300, Mathias Nyman wrote:
> > > > On 3/31/26 18:31, stern@rowland.harvard.edu wrote:
> > > > > 
> > > > > How about this instead?  We add a "halted" flag to the usb_host_endpoint
> > > > > structure, and the core will set this flag whenever a bulk or interrupt
> > > > > URB gets a status other than 0 (before putting the URB on the bh list).
> > > > > If an URB has one of these statuses, when its completion handler returns
> > > > > the core will unlink all the URBs queued to the same endpoint.  Finally,
> > > > > the "halted" flag should be cleared after a completion handler returns
> > > > > if there are no more unlinked URBs still in the queue or URBs waiting on
> > > > > the bh list to be given back.
> > > > > 
> > > > > The result of this is that any URB remaining in the queue when the flag
> > > > > is cleared must have been submitted by the class driver _after_ the
> > > > > failing URB's completion handler has run.  We can assume the class
> > > > > driver knows what it's doing in this case.
> > > > > 
> > > > > The endpoint queue shouldn't be restarted until the "halted" flag is
> > > > > cleared.  Either right away, if there are any URBs in the queue, or not
> > > > > until the next URB is submitted.  Doing this might require a new HCD
> > > > > callback.  (It would also mean the kerneldoc for usb_unlink_urb() would
> > > > > need to be updated, because the endpoint might restart before all the
> > > > > completion handlers for the unlinked URBs have run.)
> > > > > 
> > > > > What I'm trying to do here is come up with a single, consistent proposal
> > > > > for exactly when halted endpoint queues should restart.  Maybe someone
> > > > > else has a better suggestion.
> 
> > How about this:
> > 
> > Introduce a halted flag the following conditions:
> > 
> > * Introduce the halted flag in usb_host_endpoint
> > * The halted flag must be implemented as a bit in a unsigned long so
> >   we can use atomic bit operation
> 
> I don't see why it needs to use atomic operations.  Unless you're 
> thinking that we might want to add other bitflags into the same unsigned 
> long?

Now that I think about it again, it's not needed if we have the
requirement for clearing the flag only after usb_reset_endpoint.

> 
> > * Only the HCD may set the halted flag, and only upon checking the
> >   first URB completing with a halted status
> 
> Every status other than 0 should mean that the endpoint's queue is 
> halted.  But not all statuses require a clear-halt or reset-endpoint to 
> recover.  For instance, a short transfer when the URB_SHORT_NOT_OK flag 
> is set.

It seems we're using the "halted" flag for different things. The halted
flag to me is the endpoint's state and rather than the endpoint queue
state. That is, if the endpoint is halted, there's an error that was
reported on the pipe. So, an -ECONNRESET would not cause a halted
endpoint.

> 
> Why do you want the HCD to set the flag instead of 
> usb_hcd_giveback_urb()?  Just one central place is simpler than in every 
> HCD.

I'm just thinking in term of whose role to return the pending URBs.
Typically the HCD handles when to give back URBs. If the HCD were to
handle giving back pending URBs due to halted endpoint, then it should
be the one to set the halted flag. However, if the core tracks and does
both setting and clearing of the halted flag, then the core should
handle canceling the URBs. As you said, if we can shift the burden to
the core, that would help keep the flow consistent and centralized.

> 
> > * Only the USB core may clear the halted flag, and only after
> >   usb_reset_endpoint returns (this makes sure the HCD drained and reset
> >   the endpoint before the flag is cleared and new URBs are accepted)
> > * The usb_reset_endpoint must be called after clear-halt, SetInterface,
> >   and SetConfiguration.
> 
> That's easy to do just by adding it into the appropriate core routines.

Yes.

> 
> > * The USB core will not attempt to unlink pending URBs due to halted
> >   condition
> > * The HCD is responsible for completing or canceling queued URBs
> >   when the halted flag is set. Cancelled and newly submitted URBs will
> >   be returned with -EPIPE as long as the halted flag is set
> 
> Why make the HCD responsible for draining the queue?  It's like setting 
> the halted flag; one central place is simpler and less error-prone than 
> lots of separate places.
> 
> For newly submitted URBs, should the submission fail with -EPIPE, or 
> should the submission succeed and then the URB complete with -EPIPE?  
> The first is simpler, but drivers probably aren't prepared for it -- it 
> would mean adding error handling to the submission code as well as to 
> the completion handler code.
> 
> (Actually, that wouldn't matter unless the driver was queuing up 
> multiple URBs, in which case it should be prepared to handle submission 
> errors in the middle.)

I didn't think about failing immediately on submission because we would
need to audit every class driver for that. Perhaps the core can
intercept the URBs and immediately schedule a completion handler with
the updated URB's status?

> 
> > * The class driver is responsible to check the halted flag to
> >   determine whether to initiate error recovery via usb_clear_halt
> > 
> > I'm trying to keep a clear separation of responsibilities between HCD
> > and the USB core. Also, I try to keep the halted flag more closely match
> > the state of the endpoint. 
> > 
> > Let me know what you think?
> 
> Making the flag match the endpoint state is a good idea.  But there is 
> some ambiguity: Do you mean the state in the device, or the state in the 
> HC hardware, or the state in the HCD?  After all, these aren't always 
> the same.  For instance, unlinking an URB won't affect the device's 
> state but it will affect the state on the host side.
> 

The HCD state. If we can let the core manage the halted state, then here
are the points where the halted state is updated (hopefully I didn't
miss any other ones):

Endpoint is halted (reported by HCD):
* URB returned with -EPROTO or -EPIPE

Endpoint is active (updated by the core):
* SetConfiguration, SetInterface, clear-halt

Thanks,
Thinh

ps. I'll be out of town next week. Apologies if my response will be
delayed...

next prev parent reply	other threads:[~2026-04-04  1:16 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-23 12:25 [RFC PATCH 0/2] fix xhci endpoint restart at EPROTO Mathias Nyman
2026-03-23 12:25 ` [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error Mathias Nyman
2026-03-25  1:52   ` Thinh Nguyen
2026-03-25  9:38     ` Mathias Nyman
2026-03-26  1:19       ` Thinh Nguyen
2026-03-26 11:25         ` Mathias Nyman
2026-03-26 23:24           ` Thinh Nguyen
2026-03-30 12:51             ` Mathias Nyman
2026-03-30 14:17               ` stern
2026-03-31  9:34                 ` Mathias Nyman
2026-03-31 15:31                   ` stern
2026-04-01 22:08                     ` Mathias Nyman
2026-04-02  2:36                       ` stern
2026-04-03  1:59                         ` Thinh Nguyen
2026-04-03  2:42                           ` stern
2026-04-03  8:51                             ` Michal Pecio
2026-04-03 14:55                               ` stern
2026-04-03 19:13                                 ` xhci-hcd and URB_SHORT_NOT_OK Michal Pecio
2026-04-03 20:17                                   ` stern
2026-04-04  1:15                             ` Thinh Nguyen [this message]
2026-04-04  1:54                               ` [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error stern
2026-04-04 20:41                                 ` Thinh Nguyen
2026-04-04 21:54                                   ` Alan Stern
2026-04-04 22:15                                     ` Thinh Nguyen
2026-04-04 22:28                                       ` Thinh Nguyen
2026-04-05  1:30                                         ` Alan Stern
2026-04-05  3:10                                           ` Thinh Nguyen
2026-04-07 15:23                                             ` Alan Stern
2026-04-07 20:24                                               ` Mathias Nyman
2026-04-01 22:08               ` Thinh Nguyen
2026-04-01 22:34                 ` Mathias Nyman
2026-04-01 22:47                   ` Thinh Nguyen
2026-03-23 12:25 ` [RFC PATCH 2/2] xhci: Ensure URB is given back when endpoint halts on a multi-TD URB Mathias Nyman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260404011530.aukxllvizvmc3f3x@synopsys.com \
    --to=thinh.nguyen@synopsys.com \
    --cc=linux-usb@vger.kernel.org \
    --cc=mathias.nyman@linux.intel.com \
    --cc=michal.pecio@gmail.com \
    --cc=niklas.neronin@linux.intel.com \
    --cc=oneukum@suse.com \
    --cc=stern@rowland.harvard.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox