[RFC PATCH 0/2] fix xhci endpoint restart at EPROTO

public inbox for linux-usb@vger.kernel.org
 help / color / mirror / Atom feed

* [RFC PATCH 0/2] fix xhci endpoint restart at EPROTO
@ 2026-03-23 12:25 Mathias Nyman
  2026-03-23 12:25 ` [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error Mathias Nyman
  2026-03-23 12:25 ` [RFC PATCH 2/2] xhci: Ensure URB is given back when endpoint halts on a multi-TD URB Mathias Nyman
  0 siblings, 2 replies; 8+ messages in thread
From: Mathias Nyman @ 2026-03-23 12:25 UTC (permalink / raw)
  To: linux-usb
  Cc: stern, Thinh.Nguyen, michal.pecio, oneukum, niklas.neronin,
	Mathias Nyman

Hi

These patches are related to the 'correctly handling EPROTO' discussion,
and aim to fix the xhci endpoint automatic restart issue.

It also makes sure that multi TD URBs are given back with proper
EPROTO/EPIPE status if a stall/transaction error is triggered on
any TD of the URB, not just the last one.

endpoints will still be restarted when a new URB is queued, this does
not fix the issue with BH URBs completion, but will print a debug message
if a new URB is queued before previous 'tainted' pending URBs are cancelled.

I haven't been able to test the actual codepaths in these patches yet.
Can't easily queue several URBs and somehow trigger stall or transaction
error on early URBs, but would like to get feeback on this solution early.

Thanks
Mathias

Mathias Nyman (2):
  xhci: prevent automatic endpoint restart after stall or error
  xhci: Ensure URB is given back when endpoint halts on a multi-TD URB

 drivers/usb/host/xhci-ring.c | 50 +++++++++++++++++++++++++++++++-----
 drivers/usb/host/xhci.c      |  7 +++++
 drivers/usb/host/xhci.h      |  4 ++-
 3 files changed, 53 insertions(+), 8 deletions(-)

-- 
2.43.0

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error
  2026-03-23 12:25 [RFC PATCH 0/2] fix xhci endpoint restart at EPROTO Mathias Nyman
@ 2026-03-23 12:25 ` Mathias Nyman
  2026-03-25  1:52   ` Thinh Nguyen
  2026-03-23 12:25 ` [RFC PATCH 2/2] xhci: Ensure URB is given back when endpoint halts on a multi-TD URB Mathias Nyman
  1 sibling, 1 reply; 8+ messages in thread
From: Mathias Nyman @ 2026-03-23 12:25 UTC (permalink / raw)
  To: linux-usb
  Cc: stern, Thinh.Nguyen, michal.pecio, oneukum, niklas.neronin,
	Mathias Nyman

Avoid automatically restarting bulk or interrupt transfers after a
URB is given back due to stall or error.

Introduce a 'TD_TAINTED' state for pending TDs queued on a endpoint when
it halted. The actual TD the endpoint halted on is marked TD_HALTED,
and its URB is given back with proper EPROTO or EPIPE error code.

Don't automatically restart an endpoint if the next queued TD after
the TD_HALTED one is marked tainted.

class driver is expected to cancel all those pending URBs before
submitting more URBs, or before clearing halt, but is not forced to.
Always start the endpoint when a new URB is queued, but print a debug
message if there are still tainted URBs left uncancelled.

The same code is used for clearing up cached TDs from xHC in both
the URB stall/error case as when a class driver cancels URBs.
In the cancel case endpoint is restarted after removing the cancelled
URB.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 48 ++++++++++++++++++++++++++++++------
 drivers/usb/host/xhci.c      |  7 ++++++
 drivers/usb/host/xhci.h      |  4 ++-
 3 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 1cbefee3c4ca..f01f343a7e37 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -981,6 +981,31 @@ static int xhci_reset_halted_ep(struct xhci_hcd *xhci, unsigned int slot_id,
 	return ret;
 }
 
+/*
+ * Mark the TD the endpoint halted on as TD_HALTED and add it to the cancelled
+ * td list, ensuring invalidate_cancelled_tds() clears the TD from xHC cache.
+ * Mark the other TDs on that bulk or interrupt endpoint as TD_TAINTED to
+ * prevent the ring from being restarted too early.
+ */
+
+static void xhci_cancel_halted_tds(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
+				   struct xhci_td *td)
+{
+	struct xhci_ring *ring;
+	struct xhci_td *tdi;
+
+	ring = xhci_urb_to_transfer_ring(xhci, td->urb);
+
+	td->cancel_status = TD_HALTED;
+	if (list_empty(&td->cancelled_td_list))
+		list_add_tail(&td->cancelled_td_list, &ep->cancelled_td_list);
+
+	list_for_each_entry(tdi, &ring->td_list, td_list) {
+		if (!tdi->cancel_status && ring->type != TYPE_CTRL)
+			tdi->cancel_status = TD_TAINTED;
+	}
+}
+
 static int xhci_handle_halted_endpoint(struct xhci_hcd *xhci,
 				struct xhci_virt_ep *ep,
 				struct xhci_td *td,
@@ -999,10 +1024,8 @@ static int xhci_handle_halted_endpoint(struct xhci_hcd *xhci,
 	/* add td to cancelled list and let reset ep handler take care of it */
 	if (reset_type == EP_HARD_RESET) {
 		ep->ep_state |= EP_HARD_CLEAR_TOGGLE;
-		if (td && list_empty(&td->cancelled_td_list)) {
-			list_add_tail(&td->cancelled_td_list, &ep->cancelled_td_list);
-			td->cancel_status = TD_HALTED;
-		}
+		if (td)
+			xhci_cancel_halted_tds(xhci, ep, td);
 	}
 
 	if (ep->ep_state & EP_HALTED) {
@@ -1079,6 +1102,7 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep)
 			case TD_CLEARED: /* TD is already no-op */
 			case TD_CLEARING_CACHE: /* set TR deq command already queued */
 				break;
+			case TD_TAINTED:
 			case TD_DIRTY: /* TD is cached, clear it */
 			case TD_HALTED:
 			case TD_CLEARING_CACHE_DEFERRED:
@@ -1406,6 +1430,14 @@ void xhci_hc_died(struct xhci_hcd *xhci)
 		usb_hc_died(xhci_to_hcd(xhci));
 }
 
+bool next_td_is_tainted(struct xhci_ring *ring)
+{
+	struct xhci_td *td;
+
+	td = list_first_entry_or_null(&ring->td_list, struct xhci_td, td_list);
+	return td && td->cancel_status == TD_TAINTED;
+}
+
 /*
  * When we get a completion for a Set Transfer Ring Dequeue Pointer command,
  * we need to clear the set deq pending flag in the endpoint ring state, so that
@@ -1543,13 +1575,15 @@ static void xhci_handle_cmd_set_deq(struct xhci_hcd *xhci, int slot_id,
 			 __func__);
 		xhci_invalidate_cancelled_tds(ep);
 		/* Try to restart the endpoint if all is done */
-		ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
+		if (!next_td_is_tainted(ep_ring))
+			ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
 		/* Start giving back any TDs invalidated above */
 		xhci_giveback_invalidated_tds(ep);
 	} else {
 		/* Restart any rings with pending URBs */
-		xhci_dbg(ep->xhci, "%s: All TDs cleared, ring doorbell\n", __func__);
-		ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
+		xhci_dbg(ep->xhci, "%s: All cancelled TDs cleared\n", __func__);
+		if (!next_td_is_tainted(ep_ring))
+			ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
 	}
 }
 
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index ef6d8662adec..8f417836ae1f 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -1625,6 +1625,7 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
 	unsigned long flags;
 	int ret = 0;
 	unsigned int slot_id, ep_index;
+	struct xhci_ring *ring;
 	unsigned int *ep_state;
 	struct urb_priv	*urb_priv;
 	int num_tds;
@@ -1661,6 +1662,7 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
 	}
 
 	slot_id = urb->dev->slot_id;
+	ring = xhci_urb_to_transfer_ring(xhci, urb);
 
 	if (!HCD_HW_ACCESSIBLE(hcd)) {
 		ret = -ESHUTDOWN;
@@ -1694,6 +1696,10 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
 		goto free_priv;
 	}
 
+	ring = xhci_urb_to_transfer_ring(xhci, urb);
+	if (ring && next_td_is_tainted(ring))
+		xhci_dbg(xhci, "WARN, enqueue URB without canceling old tainted URBs\n");
+
 	switch (usb_endpoint_type(&urb->ep->desc)) {
 
 	case USB_ENDPOINT_XFER_CONTROL:
@@ -3354,6 +3360,7 @@ static void xhci_endpoint_reset(struct usb_hcd *hcd,
 
 	/* Bail out if toggle is already being cleared by a endpoint reset */
 	spin_lock_irqsave(&xhci->lock, flags);
+
 	if (ep->ep_state & EP_HARD_CLEAR_TOGGLE) {
 		ep->ep_state &= ~EP_HARD_CLEAR_TOGGLE;
 		spin_unlock_irqrestore(&xhci->lock, flags);
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 2b0796f6d00e..0be040f87abf 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1292,7 +1292,8 @@ struct xhci_segment {
 };
 
 enum xhci_cancelled_td_status {
-	TD_DIRTY = 0,
+	TD_TAINTED = 1,
+	TD_DIRTY,
 	TD_HALTED,
 	TD_CLEARING_CACHE,
 	TD_CLEARING_CACHE_DEFERRED,
@@ -1951,6 +1952,7 @@ void xhci_ring_doorbell_for_active_rings(struct xhci_hcd *xhci,
 void xhci_cleanup_command_queue(struct xhci_hcd *xhci);
 void inc_deq(struct xhci_hcd *xhci, struct xhci_ring *ring);
 unsigned int count_trbs(u64 addr, u64 len);
+bool next_td_is_tainted(struct xhci_ring *ring);
 int xhci_stop_endpoint_sync(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
 			    int suspend, gfp_t gfp_flags);
 void xhci_process_cancelled_tds(struct xhci_virt_ep *ep);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH 2/2] xhci: Ensure URB is given back when endpoint halts on a multi-TD URB
  2026-03-23 12:25 [RFC PATCH 0/2] fix xhci endpoint restart at EPROTO Mathias Nyman
  2026-03-23 12:25 ` [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error Mathias Nyman
@ 2026-03-23 12:25 ` Mathias Nyman
  1 sibling, 0 replies; 8+ messages in thread
From: Mathias Nyman @ 2026-03-23 12:25 UTC (permalink / raw)
  To: linux-usb
  Cc: stern, Thinh.Nguyen, michal.pecio, oneukum, niklas.neronin,
	Mathias Nyman

Make sure URB is given back by xhci driver after a stall (EPIPE)
or transaction error (EPROTO) on one of the TDs in the URB.

Most bulk, interrupt and control transfer URBs consist of one TD,
so adding that TD to the cancel list was enough to give back the
entire URB when invalidating and giving back cancelled TDs

Some bulk URBs that need an additional zero length packet consist
of two TDs. Make sure the second TD is added to the cancel list so
that URB is given back when clearing the internal endpoint halt.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index f01f343a7e37..5305ad7be025 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -982,8 +982,9 @@ static int xhci_reset_halted_ep(struct xhci_hcd *xhci, unsigned int slot_id,
 }
 
 /*
- * Mark the TD the endpoint halted on as TD_HALTED and add it to the cancelled
- * td list, ensuring invalidate_cancelled_tds() clears the TD from xHC cache.
+ * Mark the TD the endpoint halted on as TD_HALTED, add all TDs of that URB
+ * to the cancelled td list, ensuring invalidate_cancelled_tds() clears the
+ * TD from xHC cache and URB is given back.
  * Mark the other TDs on that bulk or interrupt endpoint as TD_TAINTED to
  * prevent the ring from being restarted too early.
  */
@@ -997,10 +998,11 @@ static void xhci_cancel_halted_tds(struct xhci_hcd *xhci, struct xhci_virt_ep *e
 	ring = xhci_urb_to_transfer_ring(xhci, td->urb);
 
 	td->cancel_status = TD_HALTED;
-	if (list_empty(&td->cancelled_td_list))
-		list_add_tail(&td->cancelled_td_list, &ep->cancelled_td_list);
 
 	list_for_each_entry(tdi, &ring->td_list, td_list) {
+		if (tdi->urb == td->urb && list_empty(&tdi->cancelled_td_list))
+			list_add_tail(&tdi->cancelled_td_list, &ep->cancelled_td_list);
+
 		if (!tdi->cancel_status && ring->type != TYPE_CTRL)
 			tdi->cancel_status = TD_TAINTED;
 	}
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error
  2026-03-23 12:25 ` [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error Mathias Nyman
@ 2026-03-25  1:52   ` Thinh Nguyen
  2026-03-25  9:38     ` Mathias Nyman
  0 siblings, 1 reply; 8+ messages in thread
From: Thinh Nguyen @ 2026-03-25  1:52 UTC (permalink / raw)
  To: Mathias Nyman
  Cc: linux-usb@vger.kernel.org, stern@rowland.harvard.edu,
	Thinh Nguyen, michal.pecio@gmail.com, oneukum@suse.com,
	niklas.neronin@linux.intel.com

On Mon, Mar 23, 2026, Mathias Nyman wrote:
> Avoid automatically restarting bulk or interrupt transfers after a
> URB is given back due to stall or error.
> 
> Introduce a 'TD_TAINTED' state for pending TDs queued on a endpoint when
> it halted. The actual TD the endpoint halted on is marked TD_HALTED,
> and its URB is given back with proper EPROTO or EPIPE error code.
> 
> Don't automatically restart an endpoint if the next queued TD after
> the TD_HALTED one is marked tainted.

Sounds good for -EPROTO, but will a clear-halt ring the corresponding
the endpoint's doorbell for STALL endpoint?

BR,
Thinh

> 
> class driver is expected to cancel all those pending URBs before
> submitting more URBs, or before clearing halt, but is not forced to.
> Always start the endpoint when a new URB is queued, but print a debug
> message if there are still tainted URBs left uncancelled.
> 
> The same code is used for clearing up cached TDs from xHC in both
> the URB stall/error case as when a class driver cancels URBs.
> In the cancel case endpoint is restarted after removing the cancelled
> URB.
> 
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> ---
>  drivers/usb/host/xhci-ring.c | 48 ++++++++++++++++++++++++++++++------
>  drivers/usb/host/xhci.c      |  7 ++++++
>  drivers/usb/host/xhci.h      |  4 ++-
>  3 files changed, 51 insertions(+), 8 deletions(-)
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error
  2026-03-25  1:52   ` Thinh Nguyen
@ 2026-03-25  9:38     ` Mathias Nyman
  2026-03-26  1:19       ` Thinh Nguyen
  0 siblings, 1 reply; 8+ messages in thread
From: Mathias Nyman @ 2026-03-25  9:38 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: linux-usb@vger.kernel.org, stern@rowland.harvard.edu,
	michal.pecio@gmail.com, oneukum@suse.com,
	niklas.neronin@linux.intel.com

On 3/25/26 03:52, Thinh Nguyen wrote:
> On Mon, Mar 23, 2026, Mathias Nyman wrote:
>> Avoid automatically restarting bulk or interrupt transfers after a
>> URB is given back due to stall or error.
>>
>> Introduce a 'TD_TAINTED' state for pending TDs queued on a endpoint when
>> it halted. The actual TD the endpoint halted on is marked TD_HALTED,
>> and its URB is given back with proper EPROTO or EPIPE error code.
>>
>> Don't automatically restart an endpoint if the next queued TD after
>> the TD_HALTED one is marked tainted.
> 
> Sounds good for -EPROTO, but will a clear-halt ring the corresponding
> the endpoint's doorbell for STALL endpoint?
> 

With this change xhci would not restart the stalled endpoint after a clear-halt
request. The first usb_enqueue() call after clear-halt would start it.

Could make sense to restart the endpoint after a clear-halt, and just add a small
debug message if the next queued URB is marked 'tainted'.

Thanks for the feedback
Mathias


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error
  2026-03-25  9:38     ` Mathias Nyman
@ 2026-03-26  1:19       ` Thinh Nguyen
  2026-03-26 11:25         ` Mathias Nyman
  0 siblings, 1 reply; 8+ messages in thread
From: Thinh Nguyen @ 2026-03-26  1:19 UTC (permalink / raw)
  To: Mathias Nyman
  Cc: Thinh Nguyen, linux-usb@vger.kernel.org,
	stern@rowland.harvard.edu, michal.pecio@gmail.com,
	oneukum@suse.com, niklas.neronin@linux.intel.com

On Wed, Mar 25, 2026, Mathias Nyman wrote:
> On 3/25/26 03:52, Thinh Nguyen wrote:
> > On Mon, Mar 23, 2026, Mathias Nyman wrote:
> > > Avoid automatically restarting bulk or interrupt transfers after a
> > > URB is given back due to stall or error.
> > > 
> > > Introduce a 'TD_TAINTED' state for pending TDs queued on a endpoint when
> > > it halted. The actual TD the endpoint halted on is marked TD_HALTED,
> > > and its URB is given back with proper EPROTO or EPIPE error code.
> > > 
> > > Don't automatically restart an endpoint if the next queued TD after
> > > the TD_HALTED one is marked tainted.
> > 
> > Sounds good for -EPROTO, but will a clear-halt ring the corresponding
> > the endpoint's doorbell for STALL endpoint?
> > 
> 
> With this change xhci would not restart the stalled endpoint after a clear-halt
> request. The first usb_enqueue() call after clear-halt would start it.
> 
> Could make sense to restart the endpoint after a clear-halt, and just add a small
> debug message if the next queued URB is marked 'tainted'.
> 

The -EPROTO should be handled differently than -EPIPE. A STALL endpoint
is part of a normal usb flow. Should the class driver submit a new URB
while the endpoint is STALL, we would always expect a STALL error
response after the endpoint is restarted. That's not the case with
-EPROTO where the data may be corrupted and/or the host and device are
out of sync. We should not continue until the class driver do some
recovery. IMHO, we can keep the handling of -EPIPE how it was before.
Let the xHC tell whether the STALL error still persists instead of
managing it by the xhci driver.

Thanks,
Thinh

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error
  2026-03-26  1:19       ` Thinh Nguyen
@ 2026-03-26 11:25         ` Mathias Nyman
  2026-03-26 23:24           ` Thinh Nguyen
  0 siblings, 1 reply; 8+ messages in thread
From: Mathias Nyman @ 2026-03-26 11:25 UTC (permalink / raw)
  To: Thinh Nguyen
  Cc: linux-usb@vger.kernel.org, stern@rowland.harvard.edu,
	michal.pecio@gmail.com, oneukum@suse.com,
	niklas.neronin@linux.intel.com

On 3/26/26 03:19, Thinh Nguyen wrote:
> On Wed, Mar 25, 2026, Mathias Nyman wrote:
>> On 3/25/26 03:52, Thinh Nguyen wrote:
>>> On Mon, Mar 23, 2026, Mathias Nyman wrote:
>>>> Avoid automatically restarting bulk or interrupt transfers after a
>>>> URB is given back due to stall or error.
>>>>
>>>> Introduce a 'TD_TAINTED' state for pending TDs queued on a endpoint when
>>>> it halted. The actual TD the endpoint halted on is marked TD_HALTED,
>>>> and its URB is given back with proper EPROTO or EPIPE error code.
>>>>
>>>> Don't automatically restart an endpoint if the next queued TD after
>>>> the TD_HALTED one is marked tainted.
>>>
>>> Sounds good for -EPROTO, but will a clear-halt ring the corresponding
>>> the endpoint's doorbell for STALL endpoint?
>>>
>>
>> With this change xhci would not restart the stalled endpoint after a clear-halt
>> request. The first usb_enqueue() call after clear-halt would start it.
>>
>> Could make sense to restart the endpoint after a clear-halt, and just add a small
>> debug message if the next queued URB is marked 'tainted'.
>>
> 
> The -EPROTO should be handled differently than -EPIPE. A STALL endpoint
> is part of a normal usb flow. Should the class driver submit a new URB
> while the endpoint is STALL, we would always expect a STALL error
> response after the endpoint is restarted. That's not the case with
> -EPROTO where the data may be corrupted and/or the host and device are
> out of sync. We should not continue until the class driver do some
> recovery. IMHO, we can keep the handling of -EPIPE how it was before.
> Let the xHC tell whether the STALL error still persists instead of
> managing it by the xhci driver.
> 
I agree that that we should restart the endpoint if class/core enqueues a new
URB _after_ xhci gave back an URB with EPIPE after endpoint STALL.

But I don't think we should restart the ring to continue processing URBs that
were queued before the endpoint stalled. This would prevent the class/core
from even attempting to retire the pending URBs, something USB2.0 spec,
'5.8.5 Bulk Transfer Data Sequences' requires:

"If a halt condition is detected on a bulk pipe due to transmission errors or
  a STALL handshake being returned from the endpoint, all pending IRPs are
  retired.  Removal of the halt condition is achieved via software intervention
  through a separate control pipe."

Thanks
Mathias

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error
  2026-03-26 11:25         ` Mathias Nyman
@ 2026-03-26 23:24           ` Thinh Nguyen
  0 siblings, 0 replies; 8+ messages in thread
From: Thinh Nguyen @ 2026-03-26 23:24 UTC (permalink / raw)
  To: Mathias Nyman
  Cc: Thinh Nguyen, linux-usb@vger.kernel.org,
	stern@rowland.harvard.edu, michal.pecio@gmail.com,
	oneukum@suse.com, niklas.neronin@linux.intel.com

On Thu, Mar 26, 2026, Mathias Nyman wrote:
> On 3/26/26 03:19, Thinh Nguyen wrote:
> > On Wed, Mar 25, 2026, Mathias Nyman wrote:
> > > On 3/25/26 03:52, Thinh Nguyen wrote:
> > > > On Mon, Mar 23, 2026, Mathias Nyman wrote:
> > > > > Avoid automatically restarting bulk or interrupt transfers after a
> > > > > URB is given back due to stall or error.
> > > > > 
> > > > > Introduce a 'TD_TAINTED' state for pending TDs queued on a endpoint when
> > > > > it halted. The actual TD the endpoint halted on is marked TD_HALTED,
> > > > > and its URB is given back with proper EPROTO or EPIPE error code.
> > > > > 
> > > > > Don't automatically restart an endpoint if the next queued TD after
> > > > > the TD_HALTED one is marked tainted.
> > > > 
> > > > Sounds good for -EPROTO, but will a clear-halt ring the corresponding
> > > > the endpoint's doorbell for STALL endpoint?
> > > > 
> > > 
> > > With this change xhci would not restart the stalled endpoint after a clear-halt
> > > request. The first usb_enqueue() call after clear-halt would start it.
> > > 
> > > Could make sense to restart the endpoint after a clear-halt, and just add a small
> > > debug message if the next queued URB is marked 'tainted'.
> > > 
> > 
> > The -EPROTO should be handled differently than -EPIPE. A STALL endpoint
> > is part of a normal usb flow. Should the class driver submit a new URB
> > while the endpoint is STALL, we would always expect a STALL error
> > response after the endpoint is restarted. That's not the case with
> > -EPROTO where the data may be corrupted and/or the host and device are
> > out of sync. We should not continue until the class driver do some
> > recovery. IMHO, we can keep the handling of -EPIPE how it was before.
> > Let the xHC tell whether the STALL error still persists instead of
> > managing it by the xhci driver.
> > 
> I agree that that we should restart the endpoint if class/core enqueues a new
> URB _after_ xhci gave back an URB with EPIPE after endpoint STALL.
> 
> But I don't think we should restart the ring to continue processing URBs that
> were queued before the endpoint stalled. This would prevent the class/core
> from even attempting to retire the pending URBs, something USB2.0 spec,
> '5.8.5 Bulk Transfer Data Sequences' requires:
> 
> "If a halt condition is detected on a bulk pipe due to transmission errors or
>  a STALL handshake being returned from the endpoint, all pending IRPs are
>  retired.  Removal of the halt condition is achieved via software intervention
>  through a separate control pipe."
> 

Fair point. Then the core will need to track the endpoint's STALL state
and parse the clear-halt request to know which endpoint and when to
clear the STALL before it can accept new URB. So the first usb_enqueue()
call after clear-halt can start the endpoint again. The xhci will also
need to have access to this state.

Currently you have the xhci driver to "retire" the halted URBs. However,
you also noted that class/core may attempt to retire the pending URBs.
Who's expected to handle the retirement here?

On a separate note, will you plan to implement the clear-halt for EPROTO
in xhci?

Thanks!
Thinh

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2026-03-26 23:24 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-23 12:25 [RFC PATCH 0/2] fix xhci endpoint restart at EPROTO Mathias Nyman
2026-03-23 12:25 ` [RFC PATCH 1/2] xhci: prevent automatic endpoint restart after stall or error Mathias Nyman
2026-03-25  1:52   ` Thinh Nguyen
2026-03-25  9:38     ` Mathias Nyman
2026-03-26  1:19       ` Thinh Nguyen
2026-03-26 11:25         ` Mathias Nyman
2026-03-26 23:24           ` Thinh Nguyen
2026-03-23 12:25 ` [RFC PATCH 2/2] xhci: Ensure URB is given back when endpoint halts on a multi-TD URB Mathias Nyman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox