linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] xhci: fix matching completion events with TDs
@ 2024-01-23 10:22 Michał Pecio
  2024-01-26  9:59 ` [PATCH v2] " Michal Pecio
  0 siblings, 1 reply; 2+ messages in thread
From: Michał Pecio @ 2024-01-23 10:22 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb

A trb_in_td() call is used to determine if a completion event matches
one of the TRBs of the currently executing TD. This function is given
ep_ring->deq_seg and ep_ring->dequeue as the starting point to search,
but these have been observed to be out of sync with td->start_seg and
td->first_trb of the current TD under some circumstances when the EP
is being stopped.

Consequently, spurious completion events have been observed to become
erroneously associated with an unrelated pending TD.

Since the ring is being searched for the specific purpose of finding
a match with current TD, always start from current TD's first TRB.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
---

Questions:

Is this a good idea? Is it useful? Or am I missing something?

Should a diagnostic be emitted when these pointers don't match? Or when
they don't match AND something else is also wrong?

And yes, I am observing unexpected completion events when stopping an
isochronous endpoint on NEC uPD720200. They used to retire the current
TD and "TRB not in TD" appeared when its true completion arrived, now
they correctly give "TRB not in TD" right away. I consider it progress,
though the origin of spurious completions is still unknown.


 drivers/usb/host/xhci-ring.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 9673354d70d5..d9be5023abe6 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2809,7 +2809,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 			td_num--;
 
 		/* Is this a TRB in the currently executing TD? */
-		ep_seg = trb_in_td(xhci, ep_ring->deq_seg, ep_ring->dequeue,
+		ep_seg = trb_in_td(xhci, td->start_seg, td->first_trb,
 				td->last_trb, ep_trb_dma, false);
 
 		/*
@@ -2877,9 +2877,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 					"part of current TD ep_index %d "
 					"comp_code %u\n", ep_index,
 					trb_comp_code);
-				trb_in_td(xhci, ep_ring->deq_seg,
-					  ep_ring->dequeue, td->last_trb,
-					  ep_trb_dma, true);
+				trb_in_td(xhci, td->start_seg, td->first_trb,
+					  td->last_trb, ep_trb_dma, true);
 				return -ESHUTDOWN;
 			}
 		}
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* [PATCH v2] xhci: fix matching completion events with TDs
  2024-01-23 10:22 [RFC PATCH] xhci: fix matching completion events with TDs Michał Pecio
@ 2024-01-26  9:59 ` Michal Pecio
  0 siblings, 0 replies; 2+ messages in thread
From: Michal Pecio @ 2024-01-26  9:59 UTC (permalink / raw)
  To: Mathias Nyman; +Cc: linux-usb

A trb_in_td() call is used to determine if a completion event matches
any TRB of the currently executing TD. This function is told to start
searching right after the last finished TD, which is not at all where
the currently expected TD is guaranteed to begin, because some TDs in
between may have been cancelled.

Not only is a pointless work performed, but a bug resulting in the HC
executing cancelled TDs was seen to trick the driver into associating
events from a TD just cancelled with an unrelated future TD.

Since the ring is being traversed for the specific purpose of finding
a match with the current TD, always start from its first TRB. This is
the most reliable bit of information that we posses.

Tracking of HC's work progress is not affected, except for cases when
a misattributed event would have moved dequeue past a pending TD.

Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
---

v2: improved commit message based on new findings

I am now fairly convinced that this is indeed a good idea. Otherwise,
certain event abnormalities develop into several further failures:
- completion of TDs not yet completed can be reported to the core
- ... which may conceivably lead even to DMA-after-free
- ring->dequeue is progressed past a TD not yet released by hardware
- diagnostics are printed only on a later, actually correct event


 drivers/usb/host/xhci-ring.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 9673354d70d5..d9be5023abe6 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -2809,7 +2809,7 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 			td_num--;
 
 		/* Is this a TRB in the currently executing TD? */
-		ep_seg = trb_in_td(xhci, ep_ring->deq_seg, ep_ring->dequeue,
+		ep_seg = trb_in_td(xhci, td->start_seg, td->first_trb,
 				td->last_trb, ep_trb_dma, false);
 
 		/*
@@ -2877,9 +2877,8 @@ static int handle_tx_event(struct xhci_hcd *xhci,
 					"part of current TD ep_index %d "
 					"comp_code %u\n", ep_index,
 					trb_comp_code);
-				trb_in_td(xhci, ep_ring->deq_seg,
-					  ep_ring->dequeue, td->last_trb,
-					  ep_trb_dma, true);
+				trb_in_td(xhci, td->start_seg, td->first_trb,
+					  td->last_trb, ep_trb_dma, true);
 				return -ESHUTDOWN;
 			}
 		}
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2024-01-26 10:00 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-23 10:22 [RFC PATCH] xhci: fix matching completion events with TDs Michał Pecio
2024-01-26  9:59 ` [PATCH v2] " Michal Pecio

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).