From: "Michał Pecio" <michal.pecio@gmail.com>
To: mathias.nyman@linux.intel.com
Cc: gregkh@linuxfoundation.org, ki.chiang65@gmail.com,
linux-usb@vger.kernel.org, michal.pecio@gmail.com
Subject: Re: [PATCH 15/15] xhci: Handle spurious events on Etron host isoc enpoints
Date: Fri, 7 Mar 2025 09:27:02 +0100
Message-ID: <20250307092702.5f958e92@foxbook>
In-Reply-To: <20250306144954.3507700-16-mathias.nyman@linux.intel.com>
> Unplugging a USB3.0 webcam from Etron hosts while streaming results
> in errors like this:
>
> [ 2.646387] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> [ 2.646446] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8630 trb-start 000000002fdf8640 trb-end 000000002fdf8650
> [ 2.646560] xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 18 comp_code 13
> [ 2.646568] xhci_hcd 0000:03:00.0: Looking for event-dma 000000002fdf8660 trb-start 000000002fdf8670 trb-end 000000002fdf8670
>
> Etron xHC generates two transfer events for the TRB if an error is
> detected while processing the last TRB of an isoc TD.
>
> The first event can be any sort of error (like USB Transaction or
> Babble Detected, etc), and the final event is Success.
>
> The xHCI driver will handle the TD after the first event and remove
> it from its internal list, and then print an "Transfer event TRB DMA
> ptr not part of current TD" error message after the final event.
>
> Commit 5372c65e1311 ("xhci: process isoc TD properly when there was a
> transaction error mid TD.") is designed to address isoc transaction
> errors, but unfortunately it doesn't account for this scenario.
>
> This issue is similar to the XHCI_SPURIOUS_SUCCESS case where a
> success event follows a 'short transfer' event, but the TD the event
> points to is already given back.
>
> Expand the spurious success 'short transfer' event handling to cover
> the spurious success after error on Etron hosts.
>
> Kuangyi Chiang reported this issue and submitted a different solution
> based on using error_mid_td. This commit message is mostly taken
> from that patch.
>
> Reported-by: Kuangyi Chiang <ki.chiang65@gmail.com>
> Closes: https://lore.kernel.org/linux-usb/20241028025337.6372-6-ki.chiang65@gmail.com/
> Tested-by: Kuangyi Chiang <ki.chiang65@gmail.com>
> Tested-by: Michal Pecio <michal.pecio@gmail.com>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Such a simple HW quirk would be an obvious candidate for stable if
a Short Packet refactor weren't bundled with it.
And it is subtly broken. I could swear that I have mailed you about
it; maybe you missed it or I didn't explain myself clearly enough.
> ---
> drivers/usb/host/xhci-ring.c | 38 ++++++++++++++++++++++++------------
> drivers/usb/host/xhci.h | 2 +-
> 2 files changed, 27 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 2df94ed3152c..0f8acbb9cd21 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -2611,6 +2611,22 @@ static int handle_transferless_tx_event(struct xhci_hcd *xhci, struct xhci_virt_
> return 0;
> }
>
> +static bool xhci_spurious_success_tx_event(struct xhci_hcd *xhci,
> + struct xhci_ring *ring)
> +{
> + switch (ring->old_trb_comp_code) {
> + case COMP_SHORT_PACKET:
> + return xhci->quirks & XHCI_SPURIOUS_SUCCESS;
XHCI_SPURIOUS_SUCCESS applies to practically all HCs, so this code
will typically boil down to:
return (ring->old_trb_comp_code == COMP_SHORT_PACKET);
> + case COMP_USB_TRANSACTION_ERROR:
> + case COMP_BABBLE_DETECTED_ERROR:
> + case COMP_ISOCH_BUFFER_OVERRUN:
> + return xhci->quirks & XHCI_ETRON_HOST &&
> + ring->type == TYPE_ISOC;
> + default:
> + return false;
> + }
> +}
> +
> /*
> * If this function returns an error condition, it means it got a Transfer
> * event with a corrupted Slot ID, Endpoint ID, or TRB DMA address.
> @@ -2665,8 +2681,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
> case COMP_SUCCESS:
> if (EVENT_TRB_LEN(le32_to_cpu(event->transfer_len)) != 0) {
> trb_comp_code = COMP_SHORT_PACKET;
> - xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td short %d\n",
> - slot_id, ep_index, ep_ring->last_td_was_short);
> + xhci_dbg(xhci, "Successful completion on short TX for slot %u ep %u with last td comp code %d\n",
> + slot_id, ep_index, ep_ring->old_trb_comp_code);
> }
> break;
> case COMP_SHORT_PACKET:
> @@ -2817,7 +2833,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
> if (trb_comp_code != COMP_STOPPED &&
> trb_comp_code != COMP_STOPPED_LENGTH_INVALID &&
> !ring_xrun_event &&
> - !ep_ring->last_td_was_short) {
> + !xhci_spurious_success_tx_event(xhci, ep_ring)) {
> xhci_warn(xhci, "Event TRB for slot %u ep %u with no TDs queued\n",
> slot_id, ep_index);
> }
> @@ -2882,11 +2898,12 @@ static int handle_tx_event(struct xhci_hcd *xhci,
>
> /*
> * Some hosts give a spurious success event after a short
> - * transfer. Ignore it.
> + * transfer or error on last TRB. Ignore it.
> */
> - if ((xhci->quirks & XHCI_SPURIOUS_SUCCESS) &&
> - ep_ring->last_td_was_short) {
> - ep_ring->last_td_was_short = false;
'last_td_was_short' means "expect one more event", and it is being
cleared here after receiving said event, or at least suspecting so.
> + if (xhci_spurious_success_tx_event(xhci, ep_ring)) {
> + xhci_dbg(xhci, "Spurious event dma %pad, comp_code %u after %u\n",
> + &ep_trb_dma, trb_comp_code, ep_ring->old_trb_comp_code);
> + ep_ring->old_trb_comp_code = trb_comp_code;
The proper equivalent here would be to reset old_trb_comp_code to some
"impossible" value (0, -1) so that xhci_spurious_success_tx_event()
stops returning true. Otherwise, this branch will trigger again on
the next event if it's for a wrong transfer (dangerous HW or SW bug).
Specifically, this creates two problems:
1. The "one more event" we expect will always be COMP_SHORT_PACKET,
so this code will keep silently ignoring invalid events until some
event is handled without error or is other than Short Packet.
2. There are endpoints (e.g. async/adaptive audio, usb-serial IN, IIRC
some UAS too) where all or most transfers complete with Short Packet
as a matter of routine. This code will silently ignore errors until
an event is handled without error, so it will ignore all errors.
IOW, on such endpoints "TRB DMA ptr not part of current TD" can never
show up, as far as I can tell.
> return 0;
> }
>
> @@ -2909,15 +2926,12 @@ static int handle_tx_event(struct xhci_hcd *xhci,
> */
> } while (ep->skip);
>
> + ep_ring->old_trb_comp_code = trb_comp_code;
> +
> /* Get out if a TD was queued at enqueue after the xrun occurred */
> if (ring_xrun_event)
> return 0;
>
> - if (trb_comp_code == COMP_SHORT_PACKET)
> - ep_ring->last_td_was_short = true;
> - else
> - ep_ring->last_td_was_short = false;
> -
> ep_trb = &ep_seg->trbs[(ep_trb_dma - ep_seg->dma) / sizeof(*ep_trb)];
> trace_xhci_handle_transfer(ep_ring, (struct xhci_generic_trb *) ep_trb, ep_trb_dma);
>
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index d9d7cd1906f3..6c00062a9acc 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1375,7 +1375,7 @@ struct xhci_ring {
> unsigned int num_trbs_free; /* used only by xhci DbC */
> unsigned int bounce_buf_len;
> enum xhci_ring_type type;
> - bool last_td_was_short;
> + u32 old_trb_comp_code;
> struct radix_tree_root *trb_address_map;
> };
>
> --