* [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements
@ 2025-02-26 7:02 Michal Pecio
2025-02-26 7:02 ` [PATCH v3 1/5] usb: xhci: Don't skip on Stopped - Length Invalid Michal Pecio
` (5 more replies)
0 siblings, 6 replies; 10+ messages in thread
From: Michal Pecio @ 2025-02-26 7:02 UTC (permalink / raw)
To: Mathias Nyman, Greg Kroah-Hartman; +Cc: Niklas Neronin, linux-usb, linux-kernel
These patches reduce latency of error reporting in some cases related
to 'error mid TD' and Missed Service events and sometimes fix a failure
to give back such TDs altogether until they are cancelled.
Also included are fixes for potential packet loss or memory corruption
due to obscure races. Whether it causes problems IRL is not known and
the worst case would be hard to reproduce, but exactly for this reason
if the worst case actually happens, it could be hard to debug too.
The first three should be safe. The fourth should also be safe, but it
relies on HC functionality Linux never relied on before, so I placed it
towards the end in case it would need some tweaks. I tested it on all
hardware I have and it worked just fine.
The last one is perhaps the most controversial, though it should be
OK with typical "complete -> resubmit" drivers. It's the only one here
which increases latency in some severe error cases. The intent is to
avoid potentially giving back URBs not yet executed by hardware.
New in v3:
- dropped the cleanup patch
- added Don't skip on Stopped - Length Invalid
New in v3:
- fixed spurious empty list warning on xrun
- clear skip flag if one skipped event was the last
Michal Pecio (5):
usb: xhci: Don't skip on Stopped - Length Invalid
usb: xhci: Complete 'error mid TD' transfers when handling Missed
Service
usb: xhci: Fix isochronous Ring Underrun/Overrun event handling
usb: xhci: Expedite skipping missed isoch TDs on modern HCs
usb: xhci: Skip only one TD on Ring Underrun/Overrun
drivers/usb/host/xhci-ring.c | 55 +++++++++++++++++++++++++++++-------
1 file changed, 45 insertions(+), 10 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v3 1/5] usb: xhci: Don't skip on Stopped - Length Invalid
2025-02-26 7:02 [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Michal Pecio
@ 2025-02-26 7:02 ` Michal Pecio
2025-02-26 22:38 ` Michał Pecio
2025-02-26 7:03 ` [PATCH v3 2/5] usb: xhci: Complete 'error mid TD' transfers when handling Missed Service Michal Pecio
` (4 subsequent siblings)
5 siblings, 1 reply; 10+ messages in thread
From: Michal Pecio @ 2025-02-26 7:02 UTC (permalink / raw)
To: Mathias Nyman, Greg Kroah-Hartman; +Cc: Niklas Neronin, linux-usb, linux-kernel
Up until commit d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are
returned when isoc ring is stopped") in v6.11, the driver didn't skip
missed isochronous TDs when handling Stoppend and Stopped - Length
Invalid events. Instead, it erroneously cleared the skip flag, which
would cause the ring to get stuck, as future events won't match the
missed TD which is never removed from the queue until it's cancelled.
This buggy logic seems to have been in place substantially unchanged
since the 3.x series over 10 years ago, which probably speaks first
and foremost about relative rarity of this case in normal usage, but
by the spec I see no reason why it shouldn't be possible.
After d56b0b2ab142, TDs are immediately skipped when handling those
Stopped events. This poses a potential problem in case of Stopped -
Length Invalid, which occurs either on completed TDs (likely already
given back) or Link and No-Op TRBs. Such event won't be recognized
as matching any TD (unless it's the rare Link TRB inside a TD) and
will result in skipping all pending TDs, giving them back possibly
before they are done, risking isoc data loss and maybe UAF by HW.
As a compromise, don't skip and don't clear the skip flag on this
kind of event. Then the next event will skip missed TDs. A downside
of not handling Stopped - Length Invalid on a Link inside a TD is
that if the TD is cancelled, its actual length will not be updated
to account for TRBs (silently) completed before the TD was stopped.
I had no luck producing this sequence of completion events so there
is no compelling demonstration of any resulting disaster. It may be
a very rare, obscure condition. The sole motivation for this patch
is that if such unlikely event does occur, I'd rather risk reporting
a cancelled partially done isoc frame as empty than gamble with UAF.
This will be fixed more properly by looking at Stopped event's TRB
pointer when making skipping decisions, but such rework is unlikely
to be backported to v6.12, which will stay around for a few years.
Fixes: d56b0b2ab142 ("usb: xhci: ensure skipped isoc TDs are returned when isoc ring is stopped")
Cc: stable@vger.kernel.org
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
---
drivers/usb/host/xhci-ring.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 67f3d8128b10..96b90819aec7 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2870,6 +2870,10 @@ static int handle_tx_event(struct xhci_hcd *xhci,
if (!ep_seg) {
if (ep->skip && usb_endpoint_xfer_isoc(&td->urb->ep->desc)) {
+ /* this event is unlikely to match any TD, don't skip them all */
+ if (trb_comp_code == COMP_STOPPED_LENGTH_INVALID)
+ return 0;
+
skip_isoc_td(xhci, td, ep, status);
if (!list_empty(&ep_ring->td_list))
continue;
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v3 2/5] usb: xhci: Complete 'error mid TD' transfers when handling Missed Service
2025-02-26 7:02 [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Michal Pecio
2025-02-26 7:02 ` [PATCH v3 1/5] usb: xhci: Don't skip on Stopped - Length Invalid Michal Pecio
@ 2025-02-26 7:03 ` Michal Pecio
2025-02-26 7:04 ` [PATCH v3 3/5] usb: xhci: Fix isochronous Ring Underrun/Overrun event handling Michal Pecio
` (3 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Michal Pecio @ 2025-02-26 7:03 UTC (permalink / raw)
To: Mathias Nyman, Greg Kroah-Hartman; +Cc: Niklas Neronin, linux-usb, linux-kernel
Missed Service Error after an error mid TD means that the failed TD has
already been passed by the xHC without acknowledgment of the final TRB,
a known hardware bug. So don't wait any more and give back the TD.
Reproduced on NEC uPD720200 under conditions of ludicrously bad USB link
quality, confirmed to behave as expected using dynamic debug.
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
---
drivers/usb/host/xhci-ring.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 96b90819aec7..5eaf4f9154b9 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2794,7 +2794,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
xhci_dbg(xhci,
"Miss service interval error for slot %u ep %u, set skip flag\n",
slot_id, ep_index);
- return 0;
+ break;
case COMP_NO_PING_RESPONSE_ERROR:
ep->skip = true;
xhci_dbg(xhci,
@@ -2842,6 +2842,10 @@ static int handle_tx_event(struct xhci_hcd *xhci,
xhci_dequeue_td(xhci, td, ep_ring, td->status);
}
+ /* Missed TDs will be skipped on the next event */
+ if (trb_comp_code == COMP_MISSED_SERVICE_ERROR)
+ return 0;
+
if (list_empty(&ep_ring->td_list)) {
/*
* Don't print wanings if ring is empty due to a stopped endpoint generating an
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v3 3/5] usb: xhci: Fix isochronous Ring Underrun/Overrun event handling
2025-02-26 7:02 [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Michal Pecio
2025-02-26 7:02 ` [PATCH v3 1/5] usb: xhci: Don't skip on Stopped - Length Invalid Michal Pecio
2025-02-26 7:03 ` [PATCH v3 2/5] usb: xhci: Complete 'error mid TD' transfers when handling Missed Service Michal Pecio
@ 2025-02-26 7:04 ` Michal Pecio
2025-02-26 7:05 ` [PATCH v3 4/5] usb: xhci: Expedite skipping missed isoch TDs on modern HCs Michal Pecio
` (2 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Michal Pecio @ 2025-02-26 7:04 UTC (permalink / raw)
To: Mathias Nyman, Greg Kroah-Hartman; +Cc: Niklas Neronin, linux-usb, linux-kernel
The TRB pointer of these events points at enqueue at the time of error
occurrence on xHCI 1.1+ HCs or it's NULL on older ones. By the time we
are handling the event, a new TD may be queued at this ring position.
I can trigger this race by rising interrupt moderation to increase IRQ
handling delay. Similar delay may occur naturally due to system load.
If this ever happens after a Missed Service Error, missed TDs will be
skipped and the new TD processed as if it matched the event. It could
be given back prematurely, risking data loss or buffer UAF by the xHC.
Don't complete TDs on xrun events and don't warn if queued TDs don't
match the event's TRB pointer, which can be NULL or a link/no-op TRB.
Don't warn if there are no queued TDs at all.
Now that it's safe, also handle xrun events if the skip flag is clear.
This ensures completion of any TD stuck in 'error mid TD' state right
before the xrun event, which could happen if a driver submits a finite
number of URBs to a buggy HC and then an error occurs on the last TD.
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
---
drivers/usb/host/xhci-ring.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 5eaf4f9154b9..995f8a9b5b53 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2669,6 +2669,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
int status = -EINPROGRESS;
struct xhci_ep_ctx *ep_ctx;
u32 trb_comp_code;
+ bool ring_xrun_event = false;
slot_id = TRB_TO_SLOT_ID(le32_to_cpu(event->flags));
ep_index = TRB_TO_EP_ID(le32_to_cpu(event->flags)) - 1;
@@ -2775,14 +2776,12 @@ static int handle_tx_event(struct xhci_hcd *xhci,
* Underrun Event for OUT Isoch endpoint.
*/
xhci_dbg(xhci, "Underrun event on slot %u ep %u\n", slot_id, ep_index);
- if (ep->skip)
- break;
- return 0;
+ ring_xrun_event = true;
+ break;
case COMP_RING_OVERRUN:
xhci_dbg(xhci, "Overrun event on slot %u ep %u\n", slot_id, ep_index);
- if (ep->skip)
- break;
- return 0;
+ ring_xrun_event = true;
+ break;
case COMP_MISSED_SERVICE_ERROR:
/*
* When encounter missed service error, one or more isoc tds
@@ -2855,6 +2854,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
*/
if (trb_comp_code != COMP_STOPPED &&
trb_comp_code != COMP_STOPPED_LENGTH_INVALID &&
+ !ring_xrun_event &&
!ep_ring->last_td_was_short) {
xhci_warn(xhci, "Event TRB for slot %u ep %u with no TDs queued\n",
slot_id, ep_index);
@@ -2889,6 +2889,10 @@ static int handle_tx_event(struct xhci_hcd *xhci,
goto check_endpoint_halted;
}
+ /* TD was queued after xrun, maybe xrun was on a link, don't panic yet */
+ if (ring_xrun_event)
+ return 0;
+
/*
* Skip the Force Stopped Event. The 'ep_trb' of FSE is not in the current
* TD pointed by 'ep_ring->dequeue' because that the hardware dequeue
@@ -2935,6 +2939,10 @@ static int handle_tx_event(struct xhci_hcd *xhci,
*/
} while (ep->skip);
+ /* Get out if a TD was queued at enqueue after the xrun occurred */
+ if (ring_xrun_event)
+ return 0;
+
if (trb_comp_code == COMP_SHORT_PACKET)
ep_ring->last_td_was_short = true;
else
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v3 4/5] usb: xhci: Expedite skipping missed isoch TDs on modern HCs
2025-02-26 7:02 [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Michal Pecio
` (2 preceding siblings ...)
2025-02-26 7:04 ` [PATCH v3 3/5] usb: xhci: Fix isochronous Ring Underrun/Overrun event handling Michal Pecio
@ 2025-02-26 7:05 ` Michal Pecio
2025-02-26 7:05 ` [PATCH v3 5/5] usb: xhci: Skip only one TD on Ring Underrun/Overrun Michal Pecio
2025-02-26 12:41 ` [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Mathias Nyman
5 siblings, 0 replies; 10+ messages in thread
From: Michal Pecio @ 2025-02-26 7:05 UTC (permalink / raw)
To: Mathias Nyman, Greg Kroah-Hartman; +Cc: Niklas Neronin, linux-usb, linux-kernel
xHCI spec rev. 1.0 allowed the TRB pointer of Missed Service events
to be NULL. Having no idea which of the queued TDs were missed and
which are waiting, we can only set a flag to skip missed TDs later.
But HCs are also allowed to give us pointer to the last missed TRB,
and this became mandatory in spec rev. 1.1 and later.
Use this pointer, if available, to immediately skip all missed TDs.
This reduces latency and risk of skipping-related bugs, because we
can now leave the skip flag cleared for future events.
Handle Missed Service Error events as 'error mid TD', if applicable,
because rev. 1.0 spec excplicitly says so in notes to 4.10.3.2 and
later revs in 4.10.3.2 and 4.11.2.5.2. Notes to 4.9.1 seem to apply.
Tested on ASM1142 and ASM3142 v1.1 xHCs which provide TRB pointers.
Tested on AMD, Etron, Renesas v1.0 xHCs which provide TRB pointers.
Tested on NEC v0.96 and VIA v1.0 xHCs which send a NULL pointer.
Change inspired by a discussion about realtime USB audio.
Link: https://lore.kernel.org/linux-usb/76e1a191-020d-4a76-97f6-237f9bd0ede0@gmx.net/T/
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
---
drivers/usb/host/xhci-ring.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 995f8a9b5b53..ad5f0e439200 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2481,6 +2481,12 @@ static void process_isoc_td(struct xhci_hcd *xhci, struct xhci_virt_ep *ep,
if (ep_trb != td->end_trb)
td->error_mid_td = true;
break;
+ case COMP_MISSED_SERVICE_ERROR:
+ frame->status = -EXDEV;
+ sum_trbs_for_length = true;
+ if (ep_trb != td->end_trb)
+ td->error_mid_td = true;
+ break;
case COMP_INCOMPATIBLE_DEVICE_ERROR:
case COMP_STALL_ERROR:
frame->status = -EPROTO;
@@ -2791,8 +2797,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
*/
ep->skip = true;
xhci_dbg(xhci,
- "Miss service interval error for slot %u ep %u, set skip flag\n",
- slot_id, ep_index);
+ "Miss service interval error for slot %u ep %u, set skip flag%s\n",
+ slot_id, ep_index, ep_trb_dma ? ", skip now" : "");
break;
case COMP_NO_PING_RESPONSE_ERROR:
ep->skip = true;
@@ -2841,8 +2847,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
xhci_dequeue_td(xhci, td, ep_ring, td->status);
}
- /* Missed TDs will be skipped on the next event */
- if (trb_comp_code == COMP_MISSED_SERVICE_ERROR)
+ /* If the TRB pointer is NULL, missed TDs will be skipped on the next event */
+ if (trb_comp_code == COMP_MISSED_SERVICE_ERROR && !ep_trb_dma)
return 0;
if (list_empty(&ep_ring->td_list)) {
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v3 5/5] usb: xhci: Skip only one TD on Ring Underrun/Overrun
2025-02-26 7:02 [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Michal Pecio
` (3 preceding siblings ...)
2025-02-26 7:05 ` [PATCH v3 4/5] usb: xhci: Expedite skipping missed isoch TDs on modern HCs Michal Pecio
@ 2025-02-26 7:05 ` Michal Pecio
2025-02-26 12:41 ` [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Mathias Nyman
5 siblings, 0 replies; 10+ messages in thread
From: Michal Pecio @ 2025-02-26 7:05 UTC (permalink / raw)
To: Mathias Nyman, Greg Kroah-Hartman; +Cc: Niklas Neronin, linux-usb, linux-kernel
If skipping is deferred to events other than Missed Service Error itsef,
it means we are running on an xHCI 1.0 host and don't know how many TDs
were missed until we reach some ordinary transfer completion event.
And in case of ring xrun, we can't know where the xrun happened either.
If we skip all pending TDs, we may prematurely give back TDs added after
the xrun had occurred, risking data loss or buffer UAF by the xHC.
If we skip none, a driver may become confused and stop working when all
its URBs are missed and appear to be "in flight" forever.
Skip exactly one TD on each xrun event - the first one that was missed,
as we can now be sure that the HC has finished processing it. Provided
that one more TD is queued before any subsequent doorbell ring, it will
become safe to skip another TD by the time we get an xrun again.
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
---
drivers/usb/host/xhci-ring.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index ad5f0e439200..2749ebe23a33 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2885,8 +2885,21 @@ static int handle_tx_event(struct xhci_hcd *xhci,
return 0;
skip_isoc_td(xhci, td, ep, status);
- if (!list_empty(&ep_ring->td_list))
+
+ if (!list_empty(&ep_ring->td_list)) {
+ if (ring_xrun_event) {
+ /*
+ * If we are here, we are on xHCI 1.0 host with no
+ * idea how many TDs were missed or where the xrun
+ * occurred. New TDs may have been added after the
+ * xrun, so skip only one TD to be safe.
+ */
+ xhci_dbg(xhci, "Skipped one TD for slot %u ep %u",
+ slot_id, ep_index);
+ return 0;
+ }
continue;
+ }
xhci_dbg(xhci, "All TDs skipped for slot %u ep %u. Clear skip flag.\n",
slot_id, ep_index);
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements
2025-02-26 7:02 [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Michal Pecio
` (4 preceding siblings ...)
2025-02-26 7:05 ` [PATCH v3 5/5] usb: xhci: Skip only one TD on Ring Underrun/Overrun Michal Pecio
@ 2025-02-26 12:41 ` Mathias Nyman
2025-02-26 22:05 ` Michał Pecio
5 siblings, 1 reply; 10+ messages in thread
From: Mathias Nyman @ 2025-02-26 12:41 UTC (permalink / raw)
To: Michal Pecio, Mathias Nyman, Greg Kroah-Hartman
Cc: Niklas Neronin, linux-usb, linux-kernel
On 26.2.2025 9.02, Michal Pecio wrote:
> These patches reduce latency of error reporting in some cases related
> to 'error mid TD' and Missed Service events and sometimes fix a failure
> to give back such TDs altogether until they are cancelled.
>
> Also included are fixes for potential packet loss or memory corruption
> due to obscure races. Whether it causes problems IRL is not known and
> the worst case would be hard to reproduce, but exactly for this reason
> if the worst case actually happens, it could be hard to debug too.
>
> The first three should be safe. The fourth should also be safe, but it
> relies on HC functionality Linux never relied on before, so I placed it
> towards the end in case it would need some tweaks. I tested it on all
> hardware I have and it worked just fine.
>
> The last one is perhaps the most controversial, though it should be
> OK with typical "complete -> resubmit" drivers. It's the only one here
> which increases latency in some severe error cases. The intent is to
> avoid potentially giving back URBs not yet executed by hardware.
>
> New in v3:
> - dropped the cleanup patch
> - added Don't skip on Stopped - Length Invalid
>
> New in v3:
> - fixed spurious empty list warning on xrun
> - clear skip flag if one skipped event was the last
>
> Michal Pecio (5):
> usb: xhci: Don't skip on Stopped - Length Invalid
> usb: xhci: Complete 'error mid TD' transfers when handling Missed
> Service
> usb: xhci: Fix isochronous Ring Underrun/Overrun event handling
> usb: xhci: Expedite skipping missed isoch TDs on modern HCs
> usb: xhci: Skip only one TD on Ring Underrun/Overrun
>
Updated my for-usb-next branch to this v3 version
Thanks
Mathias
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements
2025-02-26 12:41 ` [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Mathias Nyman
@ 2025-02-26 22:05 ` Michał Pecio
2025-02-27 12:10 ` Mathias Nyman
0 siblings, 1 reply; 10+ messages in thread
From: Michał Pecio @ 2025-02-26 22:05 UTC (permalink / raw)
To: Mathias Nyman
Cc: Mathias Nyman, Greg Kroah-Hartman, Niklas Neronin, linux-usb,
linux-kernel
On Wed, 26 Feb 2025 14:41:44 +0200, Mathias Nyman wrote:
> Updated my for-usb-next branch to this v3 version
A few remarks regarding "Add helper to find trb from its dma address":
xhci_dma_to_trb():
This function could use xhci_for_each_ring_seg.
The use of in_range() perhaps deserves a comment, because its
correctness is not as obvious as it might seem.
Long story short, my own version:
/*
* Look up a TRB on the ring by its DMA address, return NULL if not found.
* Start from deq_seg to optimize for event handling use.
*
* Note: false positive is possible if dma < TRB_SEGMENT_SIZE *and*
* seg->dma > (dma_addr_t) 0 - TRB_SEGMENT_SIZE, but that shouldn't
* happen if seg->dma is an allocation of size TRB_SEGMENT_SIZE.
*/
static union xhci_trb *xhci_dma_to_trb(struct xhci_ring *ring, dma_addr_t dma)
{
struct xhci_segment *seg;
xhci_for_each_ring_seg(ring->deq_seg, seg)
if (in_range(dma, seg->dma, TRB_SEGMENT_SIZE))
return seg->trbs + (dma - seg->dma) / sizeof(seg->trbs[0]);
return NULL;
}
>+ struct xhci_td *matched_td;
This variable is only used as bool so it could be declared as such.
Other places still use 'td' and assume that it equals 'matched_td'.
And that's OK because there is no need for iterating further after
the matching TD is found.
>+ /* find the transfer trb this events points to */
>+ if (ep_trb_dma)
>+ ep_trb = xhci_dma_to_trb(ep_ring, ep_trb_dma);
This may deserve a dedicated warning. It's a pathology. Either the
event is bogus due to internal corruption in the HC, or it's executing
TRBs from a wrong ring due to damaged/ignored Link TRB or bad Set Deq.
Or we completely screwed up and are looking at a wrong ep_ring here.
>- if (trb_comp_code == COMP_MISSED_SERVICE_ERROR && !ep_trb_dma)
>+ if (trb_comp_code == COMP_MISSED_SERVICE_ERROR && !ep_trb)
> return 0;
Good idea. I think I would go further and refuse to handle anything
when (ep_trb_dma && !ep_trb). Nothing is going to match, nothing good
will come from trying as far as I see.
But that's a behavior change, so maybe material for a separate patch.
>+ matched_td = NULL;
>
> /* Is this a TRB in the currently executing TD? */
>- ep_seg = trb_in_td(xhci, td, ep_trb_dma, false);
>+ if (trb_in_td(xhci, td, ep_trb_dma, false))
>+ matched_td = td;
If 'matched_td' is declared as bool, this will become simply:
matched_td = trb_in_td(xhci, td, ep_trb_dma, false);
Regards,
Michal
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3 1/5] usb: xhci: Don't skip on Stopped - Length Invalid
2025-02-26 7:02 ` [PATCH v3 1/5] usb: xhci: Don't skip on Stopped - Length Invalid Michal Pecio
@ 2025-02-26 22:38 ` Michał Pecio
0 siblings, 0 replies; 10+ messages in thread
From: Michał Pecio @ 2025-02-26 22:38 UTC (permalink / raw)
To: Mathias Nyman, Greg Kroah-Hartman; +Cc: Niklas Neronin, linux-usb, linux-kernel
On Wed, 26 Feb 2025 08:02:55 +0100, Michal Pecio wrote:
> After d56b0b2ab142, TDs are immediately skipped when handling those
> Stopped events. This poses a potential problem in case of Stopped -
> Length Invalid, which occurs either on completed TDs (likely already
> given back) or Link and No-Op TRBs. Such event won't be recognized
> as matching any TD (unless it's the rare Link TRB inside a TD) and
> will result in skipping all pending TDs, giving them back possibly
> before they are done, risking isoc data loss and maybe UAF by HW.
Actually, Stopped and Stopped - Short Packet may be unsafe too.
As far as I understand, one of those (depending on SPC capability)
can occur on the second TRB of a TD whose first TRB completed with
Short Packet. Then the TD is already given back and removed from
td_list, so no match will be found with this Stopped event.
I suspect this is the reason why the driver has a policy to silently
ignore Stopped events which don't match the pending TD, and not only
Stopped - Length Invalid. Not sure why Stopped - Short Packet isn't
also ignored and yet apparently doesn't cause problems.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements
2025-02-26 22:05 ` Michał Pecio
@ 2025-02-27 12:10 ` Mathias Nyman
0 siblings, 0 replies; 10+ messages in thread
From: Mathias Nyman @ 2025-02-27 12:10 UTC (permalink / raw)
To: Michał Pecio
Cc: Mathias Nyman, Greg Kroah-Hartman, Niklas Neronin, linux-usb,
linux-kernel
On 27.2.2025 0.05, Michał Pecio wrote:
> On Wed, 26 Feb 2025 14:41:44 +0200, Mathias Nyman wrote:
>> Updated my for-usb-next branch to this v3 version
>
>
> A few remarks regarding "Add helper to find trb from its dma address":
>
> xhci_dma_to_trb():
> This function could use xhci_for_each_ring_seg.
Good point
> The use of in_range() perhaps deserves a comment, because its
> correctness is not as obvious as it might seem.
>
> Long story short, my own version:
>
> /*
> * Look up a TRB on the ring by its DMA address, return NULL if not found.
> * Start from deq_seg to optimize for event handling use.
> *
> * Note: false positive is possible if dma < TRB_SEGMENT_SIZE *and*
> * seg->dma > (dma_addr_t) 0 - TRB_SEGMENT_SIZE, but that shouldn't
> * happen if seg->dma is an allocation of size TRB_SEGMENT_SIZE.
> */
True, but as you said this shouldn't happen as we allocate TRB_SEGMENT_SIZE bytes
starting at seg->dma, so seg->dma should be at least TRB_SEGMENT_SIZE bytes from
max u64 (or u32)
We can also use "if (dma >= seg->dma && (dma - seg->dma) < TRB_SEGMENT_SIZE)"
instead of in_range(). It's a bit uglier, but we can skip additional notes.
> static union xhci_trb *xhci_dma_to_trb(struct xhci_ring *ring, dma_addr_t dma)
> {
> struct xhci_segment *seg;
>
> xhci_for_each_ring_seg(ring->deq_seg, seg)
> if (in_range(dma, seg->dma, TRB_SEGMENT_SIZE))
> return seg->trbs + (dma - seg->dma) / sizeof(seg->trbs[0]);
>
> return NULL;
> }
>
>> + struct xhci_td *matched_td;
>
> This variable is only used as bool so it could be declared as such.
> Other places still use 'td' and assume that it equals 'matched_td'.
> And that's OK because there is no need for iterating further after
> the matching TD is found.
True, it could be just a bool
>
>> + /* find the transfer trb this events points to */
>> + if (ep_trb_dma)
>> + ep_trb = xhci_dma_to_trb(ep_ring, ep_trb_dma);
>
> This may deserve a dedicated warning. It's a pathology. Either the
> event is bogus due to internal corruption in the HC, or it's executing
> TRBs from a wrong ring due to damaged/ignored Link TRB or bad Set Deq.
> Or we completely screwed up and are looking at a wrong ep_ring here.
>
>> - if (trb_comp_code == COMP_MISSED_SERVICE_ERROR && !ep_trb_dma)
>> + if (trb_comp_code == COMP_MISSED_SERVICE_ERROR && !ep_trb)
>> return 0;
>
> Good idea. I think I would go further and refuse to handle anything
> when (ep_trb_dma && !ep_trb). Nothing is going to match, nothing good
> will come from trying as far as I see.
>
> But that's a behavior change, so maybe material for a separate patch.
Idea of this patch is to slowly migrate handle_tx_event() towards the
vision in my feature_transfer_event_rework branch
https://web.git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=feature_transfer_event_rework
Niklas is working towards a similar goal, and he just informed me that this
patch conflicts a bit with his plan to get there, so I might drop this.
Thanks for looking at it.
-Mathias
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-02-27 12:09 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-26 7:02 [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Michal Pecio
2025-02-26 7:02 ` [PATCH v3 1/5] usb: xhci: Don't skip on Stopped - Length Invalid Michal Pecio
2025-02-26 22:38 ` Michał Pecio
2025-02-26 7:03 ` [PATCH v3 2/5] usb: xhci: Complete 'error mid TD' transfers when handling Missed Service Michal Pecio
2025-02-26 7:04 ` [PATCH v3 3/5] usb: xhci: Fix isochronous Ring Underrun/Overrun event handling Michal Pecio
2025-02-26 7:05 ` [PATCH v3 4/5] usb: xhci: Expedite skipping missed isoch TDs on modern HCs Michal Pecio
2025-02-26 7:05 ` [PATCH v3 5/5] usb: xhci: Skip only one TD on Ring Underrun/Overrun Michal Pecio
2025-02-26 12:41 ` [PATCH v3 0/5] xHCI: Isochronous error handling fixes and improvements Mathias Nyman
2025-02-26 22:05 ` Michał Pecio
2025-02-27 12:10 ` Mathias Nyman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).