From: "Michał Pecio" <michal.pecio@gmail.com>
To: mathias.nyman@linux.intel.com
Cc: gregkh@linuxfoundation.org, linux-usb@vger.kernel.org
Subject: Re: [PATCH 12/15] xhci: Prevent early endpoint restart when handling STALL errors.
Date: Fri, 7 Mar 2025 07:54:29 +0100
Message-ID: <20250307075429.5f9d1d4e@foxbook>
In-Reply-To: <20250306144954.3507700-13-mathias.nyman@linux.intel.com>
> Ensure that an endpoint halted due to device STALL is not
> restarted before a Clear_Feature(ENDPOINT_HALT) request is sent to
> the device.
>
> The host side of the endpoint may otherwise be started early by the
> 'Set TR Deq' command completion handler which is called if dequeue
> is moved past a cancelled or halted TD.
>
> Prevent this with a new flag set for bulk and interrupt endpoints
> when a Stall Error is received. Clear it in hcd->endpoint_reset()
> which is called after Clear_Feature(ENDPOINT_HALT) is sent.
>
> Also add a debug message if a class driver queues a new URB after
> the STALL. Note that class driver might not be aware of the STALL
> yet when it submits the URB as URBs are given back in BH.
>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Sorry for coming this late, but I haven't looked closely at some
of those xhci/for-next patches before.
This one is unfortunately incomplete, as follows:
> drivers/usb/host/xhci-ring.c | 7 +++++--
> drivers/usb/host/xhci.c | 6 ++++++
> drivers/usb/host/xhci.h | 3 ++-
> 3 files changed, 13 insertions(+), 3 deletions(-)
>
>diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
>index c2e15a27338b..7643ab9ec3b4 100644
>--- a/drivers/usb/host/xhci-ring.c
>+++ b/drivers/usb/host/xhci-ring.c
>@@ -556,8 +556,8 @@ void xhci_ring_ep_doorbell(struct xhci_hcd *xhci,
> * pointer command pending because the device can choose to start any
> * stream once the endpoint is on the HW schedule.
> */
>- if ((ep_state & EP_STOP_CMD_PENDING) || (ep_state & SET_DEQ_PENDING) ||
>- (ep_state & EP_HALTED) || (ep_state & EP_CLEARING_TT))
>+ if (ep_state & (EP_STOP_CMD_PENDING | SET_DEQ_PENDING | EP_HALTED |
>+ EP_CLEARING_TT | EP_STALLED))
> return;
Any flag added to this list needs to be added to xhci_urb_dequeue() too,
so it knows that the endpoint is held in the Stopped state and URBs can
be unlinked without trying to stop it again.
There really should be a helper function used both here and there, but
those Stop EP patches were meant for stable and I strived to make them
small and noninvasive. Then I forgot about this cleanup.
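To illustrate the helper idea, here is a minimal userspace sketch, not driver code. The flag values are copied from the quoted xhci.h hunk; EP_HOLD_MASK, ep_restart_held_off() and the two model_* functions are hypothetical names made up for this example, and the real xhci_urb_dequeue() logic is more involved than the one-line model here.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values copied from the quoted xhci.h hunk. */
#define SET_DEQ_PENDING		(1 << 0)
#define EP_HALTED		(1 << 1)
#define EP_STOP_CMD_PENDING	(1 << 2)
#define EP_CLEARING_TT		(1 << 8)
#define EP_STALLED		(1 << 9)

/* Hypothetical: a single mask, so a new flag is added in one place only. */
#define EP_HOLD_MASK	(SET_DEQ_PENDING | EP_HALTED | EP_STOP_CMD_PENDING | \
			 EP_CLEARING_TT | EP_STALLED)

/* Hypothetical helper: true while the endpoint must not be restarted. */
static bool ep_restart_held_off(unsigned int ep_state)
{
	return (ep_state & EP_HOLD_MASK) != 0;
}

/* Model of the doorbell path: returns true if the doorbell would be rung. */
static bool model_ring_doorbell(unsigned int ep_state)
{
	return !ep_restart_held_off(ep_state);
}

/* Model of the unlink path: is a new Stop Endpoint command needed? */
static bool model_dequeue_needs_stop_cmd(unsigned int ep_state)
{
	/* An endpoint that is already held off is not stopped again. */
	return !ep_restart_held_off(ep_state);
}

static void example_usage(void)
{
	/* Both paths agree because they consult the same helper. */
	assert(!model_ring_doorbell(EP_STALLED));
	assert(!model_dequeue_needs_stop_cmd(EP_STALLED));
	assert(model_ring_doorbell(0));
	assert(model_dequeue_needs_stop_cmd(0));
}
```

With a shared predicate like this, adding EP_STALLED to the mask would update both paths at once, which is the point of the suggestion above.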
NB: I also forgot about a bunch of low-impact halted EP handling bugs,
I will try to rebase and send them out today or over the weekend.
> trace_xhci_ring_ep_doorbell(slot_id, DB_VALUE(ep_index, stream_id));
> @@ -2555,6 +2555,9 @@ static void process_bulk_intr_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
>
> xhci_handle_halted_endpoint(xhci, ep, td, EP_SOFT_RESET);
> return;
> + case COMP_STALL_ERROR:
> + ep->ep_state |= EP_STALLED;
> + break;
> default:
> /* do nothing */
> break;
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 3f2cd546a7a2..0c22b78358b9 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -1604,6 +1604,11 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag
> goto free_priv;
> }
>
> + /* Class driver might not be aware ep halted due to async URB giveback */
> + if (*ep_state & EP_STALLED)
> + dev_dbg(&urb->dev->dev, "URB %p queued before clearing halt\n",
> + urb);
> +
> switch (usb_endpoint_type(&urb->ep->desc)) {
>
> case USB_ENDPOINT_XFER_CONTROL:
> @@ -3202,6 +3207,7 @@ static void xhci_endpoint_reset(struct usb_hcd *hcd,
> return;
>
> ep = &vdev->eps[ep_index];
> + ep->ep_state &= ~EP_STALLED;
... and clearing any of those flags has always been followed by calling
xhci_ring_ep_doorbell() again, to ensure that the endpoint is restarted
if it has URBs on it but restart was held off due to the flag.
xhci_urb_dequeue() relies on this too, because it looked like a sensible
design: if you have reasons not to run the EP, you set a flag. Once the
reasons are gone, you clear the flag and the EP is running again.
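The set-flag / clear-flag-and-restart convention can be sketched as a small userspace model. This is only an illustration under assumptions: struct model_ep, model_ring_doorbell() and model_clear_hold() are hypothetical stand-ins, only two of the hold flags are modelled, and the queued-URB check is part of the model rather than a claim about xhci_ring_ep_doorbell() itself.

```c
#include <assert.h>

/* Flag values copied from the quoted xhci.h hunk. */
#define EP_CLEARING_TT	(1 << 8)
#define EP_STALLED	(1 << 9)

/* Hypothetical userspace stand-in for struct xhci_virt_ep. */
struct model_ep {
	unsigned int ep_state;
	unsigned int queued_urbs;	/* URBs waiting on the ring */
	unsigned int doorbell_rings;	/* how often the EP was (re)started */
};

/* Model of the doorbell: refuse to restart while a hold flag is set. */
static void model_ring_doorbell(struct model_ep *ep)
{
	if (ep->ep_state & (EP_CLEARING_TT | EP_STALLED))
		return;
	if (ep->queued_urbs)
		ep->doorbell_rings++;
}

/* Clear a hold flag and immediately give the endpoint a chance to restart. */
static void model_clear_hold(struct model_ep *ep, unsigned int flag)
{
	ep->ep_state &= ~flag;
	model_ring_doorbell(ep);
}

static void example_usage(void)
{
	struct model_ep ep = { .ep_state = EP_STALLED, .queued_urbs = 1 };

	model_ring_doorbell(&ep);		/* held off by EP_STALLED */
	assert(ep.doorbell_rings == 0);

	ep.ep_state &= ~EP_STALLED;		/* flag cleared, nobody rings */
	assert(ep.doorbell_rings == 0);		/* the queued URB is stuck */

	ep.ep_state |= EP_STALLED;		/* rewind for the second variant */
	model_clear_hold(&ep, EP_STALLED);	/* clear and re-ring together */
	assert(ep.doorbell_rings == 1);
}
```

The middle step in example_usage() models what the hunk above does: the flag is cleared but nothing rings the doorbell, so an URB queued during the stall stays put until something else restarts the endpoint.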
> /* Bail out if toggle is already being cleared by a endpoint reset */
> spin_lock_irqsave(&xhci->lock, flags);
>diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
>index cd96e0a8c593..4ee14f651d36 100644
>--- a/drivers/usb/host/xhci.h
>+++ b/drivers/usb/host/xhci.h
>@@ -664,7 +664,7 @@ struct xhci_virt_ep {
> unsigned int err_count;
> unsigned int ep_state;
> #define SET_DEQ_PENDING (1 << 0)
>-#define EP_HALTED (1 << 1) /* For stall handling */
>+#define EP_HALTED (1 << 1) /* Halted host ep handling */
> #define EP_STOP_CMD_PENDING (1 << 2) /* For URB cancellation */
> /* Transitioning the endpoint to using streams, don't enqueue URBs */
> #define EP_GETTING_STREAMS (1 << 3)
>@@ -675,6 +675,7 @@ struct xhci_virt_ep {
> #define EP_SOFT_CLEAR_TOGGLE (1 << 7)
> /* usb_hub_clear_tt_buffer is in progress */
> #define EP_CLEARING_TT (1 << 8)
>+#define EP_STALLED (1 << 9) /* For stall handling */
I guess the usage rules for those flags should be documented somewhere
here, and helpers added, such as:
xhci_ep_cancel_pending()
xhci_ep_held_stopped()
to improve maintainability and prevent similar problems in the future.
I could sit and write something, I still have this stuff quite fresh
in memory after spending a few weeks debugging those crazy HW races.
Regards,
Michal