* [PATCH 0/8] USB fixes: xHCI error handling
@ 2023-10-26 23:16 Hector Martin
2023-10-26 23:16 ` [PATCH 1/8] usb: xhci: Guard all calls to xhci_wait_for_event Hector Martin
` (9 more replies)
0 siblings, 10 replies; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:16 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
This series is the first of a few bundles of USB fixes we have been
carrying downstream on the Asahi U-Boot branch for a few months.
Most importantly, this related set of patches makes xHCI error/stall
recovery more robust (or work at all in some cases). There are also a
couple patches fixing other xHCI bugs and adding better debug logs.
I believe this should fix this Fedora bug too:
https://bugzilla.redhat.com/show_bug.cgi?id=2244305
Signed-off-by: Hector Martin <marcan@marcan.st>
---
Hector Martin (8):
usb: xhci: Guard all calls to xhci_wait_for_event
usb: xhci: Better error handling in abort_td()
usb: xhci: Allow context state errors when halting an endpoint
usb: xhci: Recover from halted non-control endpoints
usb: xhci: Fail on attempt to queue TRBs to a halted endpoint
usb: xhci: Do not panic on event timeouts
usb: xhci: Fix DMA address calculation in queue_trb
usb: xhci: Add more debugging
drivers/usb/host/xhci-ring.c | 100 +++++++++++++++++++++++++++++++++++--------
drivers/usb/host/xhci.c | 9 ++++
include/usb/xhci.h | 2 +
3 files changed, 92 insertions(+), 19 deletions(-)
---
base-commit: fb428b61819444b9337075f49c72f326f5d12085
change-id: 20231027-usb-fixes-1-83bfc7013012
Best regards,
--
Hector Martin <marcan@marcan.st>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 1/8] usb: xhci: Guard all calls to xhci_wait_for_event
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
@ 2023-10-26 23:16 ` Hector Martin
2023-10-26 23:26 ` [PATCH] fixup! " Hector Martin
2023-10-26 23:16 ` [PATCH 2/8] usb: xhci: Better error handling in abort_td() Hector Martin
` (8 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:16 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
xhci_wait_for_event returns NULL on timeout, so the caller always has to
check for that. This addresses the immediate explosions in this part
of the code, but not the original cause.
Signed-off-by: Hector Martin <marcan@marcan.st>
---
drivers/usb/host/xhci-ring.c | 15 ++++++++++++++-
drivers/usb/host/xhci.c | 9 +++++++++
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index c8260cbdf94b..aaf128ff9317 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -511,7 +511,8 @@ static void reset_ep(struct usb_device *udev, int ep_index)
printf("Resetting EP %d...\n", ep_index);
xhci_queue_command(ctrl, 0, udev->slot_id, ep_index, TRB_RESET_EP);
event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
- field = le32_to_cpu(event->trans_event.flags);
+ if (!event)
+ return;
BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
xhci_acknowledge_event(ctrl);
@@ -519,6 +520,9 @@ static void reset_ep(struct usb_device *udev, int ep_index)
(void *)((uintptr_t)ring->enqueue | ring->cycle_state));
xhci_queue_command(ctrl, addr, udev->slot_id, ep_index, TRB_SET_DEQ);
event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
+ if (!event)
+ return;
+
BUG_ON(TRB_TO_SLOT_ID(le32_to_cpu(event->event_cmd.flags))
!= udev->slot_id || GET_COMP_CODE(le32_to_cpu(
event->event_cmd.status)) != COMP_SUCCESS);
@@ -544,6 +548,9 @@ static void abort_td(struct usb_device *udev, int ep_index)
xhci_queue_command(ctrl, 0, udev->slot_id, ep_index, TRB_STOP_RING);
event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
+ if (!event)
+ return;
+
field = le32_to_cpu(event->trans_event.flags);
BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
@@ -552,6 +559,9 @@ static void abort_td(struct usb_device *udev, int ep_index)
xhci_acknowledge_event(ctrl);
event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
+ if (!event)
+ return;
+
BUG_ON(TRB_TO_SLOT_ID(le32_to_cpu(event->event_cmd.flags))
!= udev->slot_id || GET_COMP_CODE(le32_to_cpu(
event->event_cmd.status)) != COMP_SUCCESS);
@@ -561,6 +571,9 @@ static void abort_td(struct usb_device *udev, int ep_index)
(void *)((uintptr_t)ring->enqueue | ring->cycle_state));
xhci_queue_command(ctrl, addr, udev->slot_id, ep_index, TRB_SET_DEQ);
event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
+ if (!event)
+ return;
+
BUG_ON(TRB_TO_SLOT_ID(le32_to_cpu(event->event_cmd.flags))
!= udev->slot_id || GET_COMP_CODE(le32_to_cpu(
event->event_cmd.status)) != COMP_SUCCESS);
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 5cacf0769ec7..d13cbff9b372 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -451,6 +451,9 @@ static int xhci_configure_endpoints(struct usb_device *udev, bool ctx_change)
xhci_queue_command(ctrl, in_ctx->dma, udev->slot_id, 0,
ctx_change ? TRB_EVAL_CONTEXT : TRB_CONFIG_EP);
event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
+ if (!event)
+ return -ETIMEDOUT;
+
BUG_ON(TRB_TO_SLOT_ID(le32_to_cpu(event->event_cmd.flags))
!= udev->slot_id);
@@ -647,6 +650,9 @@ static int xhci_address_device(struct usb_device *udev, int root_portnr)
xhci_queue_command(ctrl, virt_dev->in_ctx->dma,
slot_id, 0, TRB_ADDR_DEV);
event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
+ if (!event)
+ return -ETIMEDOUT;
+
BUG_ON(TRB_TO_SLOT_ID(le32_to_cpu(event->event_cmd.flags)) != slot_id);
switch (GET_COMP_CODE(le32_to_cpu(event->event_cmd.status))) {
@@ -722,6 +728,9 @@ static int _xhci_alloc_device(struct usb_device *udev)
xhci_queue_command(ctrl, 0, 0, 0, TRB_ENABLE_SLOT);
event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
+ if (!event)
+ return -ETIMEDOUT;
+
BUG_ON(GET_COMP_CODE(le32_to_cpu(event->event_cmd.status))
!= COMP_SUCCESS);
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 2/8] usb: xhci: Better error handling in abort_td()
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
2023-10-26 23:16 ` [PATCH 1/8] usb: xhci: Guard all calls to xhci_wait_for_event Hector Martin
@ 2023-10-26 23:16 ` Hector Martin
2023-10-27 0:52 ` Marek Vasut
2023-10-26 23:16 ` [PATCH 3/8] usb: xhci: Allow context state errors when halting an endpoint Hector Martin
` (7 subsequent siblings)
9 siblings, 1 reply; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:16 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
If the xHC has a problem with our STOP ENDPOINT command, it is likely to
return a completion directly instead of first a transfer event for the
in-progress transfer. Handle that more gracefully.
Right now we still BUG() on the error code, but at least we don't end up
timing out on the event and ending up with unexpected event errors.
Signed-off-by: Hector Martin <marcan@marcan.st>
---
drivers/usb/host/xhci-ring.c | 34 ++++++++++++++++++++++------------
include/usb/xhci.h | 2 ++
2 files changed, 24 insertions(+), 12 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index aaf128ff9317..d08bb8e2bfba 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -466,7 +466,8 @@ union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected)
continue;
type = TRB_FIELD_TO_TYPE(le32_to_cpu(event->event_cmd.flags));
- if (type == expected)
+ if (type == expected ||
+ (expected == TRB_NONE && type != TRB_PORT_STATUS))
return event;
if (type == TRB_PORT_STATUS)
@@ -542,27 +543,36 @@ static void abort_td(struct usb_device *udev, int ep_index)
struct xhci_ctrl *ctrl = xhci_get_ctrl(udev);
struct xhci_ring *ring = ctrl->devs[udev->slot_id]->eps[ep_index].ring;
union xhci_trb *event;
+ trb_type type;
u64 addr;
u32 field;
xhci_queue_command(ctrl, 0, udev->slot_id, ep_index, TRB_STOP_RING);
- event = xhci_wait_for_event(ctrl, TRB_TRANSFER);
+ event = xhci_wait_for_event(ctrl, TRB_NONE);
if (!event)
return;
- field = le32_to_cpu(event->trans_event.flags);
- BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
- BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
- BUG_ON(GET_COMP_CODE(le32_to_cpu(event->trans_event.transfer_len
- != COMP_STOP)));
- xhci_acknowledge_event(ctrl);
+ type = TRB_FIELD_TO_TYPE(le32_to_cpu(event->event_cmd.flags));
+ if (type == TRB_TRANSFER) {
+ field = le32_to_cpu(event->trans_event.flags);
+ BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
+ BUG_ON(TRB_TO_EP_INDEX(field) != ep_index);
+ BUG_ON(GET_COMP_CODE(le32_to_cpu(event->trans_event.transfer_len
+ != COMP_STOP)));
+ xhci_acknowledge_event(ctrl);
- event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
- if (!event)
- return;
+ event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
+ if (!event)
+ return;
+ type = TRB_FIELD_TO_TYPE(le32_to_cpu(event->event_cmd.flags));
- BUG_ON(TRB_TO_SLOT_ID(le32_to_cpu(event->event_cmd.flags))
+ } else {
+ printf("abort_td: Expected a TRB_TRANSFER TRB first\n");
+ }
+
+ BUG_ON(type != TRB_COMPLETION ||
+ TRB_TO_SLOT_ID(le32_to_cpu(event->event_cmd.flags))
!= udev->slot_id || GET_COMP_CODE(le32_to_cpu(
event->event_cmd.status)) != COMP_SUCCESS);
xhci_acknowledge_event(ctrl);
diff --git a/include/usb/xhci.h b/include/usb/xhci.h
index 4a4ac10229ac..04d16a256bbd 100644
--- a/include/usb/xhci.h
+++ b/include/usb/xhci.h
@@ -901,6 +901,8 @@ union xhci_trb {
/* TRB type IDs */
typedef enum {
+ /* reserved, used as a software sentinel */
+ TRB_NONE = 0,
/* bulk, interrupt, isoc scatter/gather, and control data stage */
TRB_NORMAL = 1,
/* setup stage for control transfers */
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 3/8] usb: xhci: Allow context state errors when halting an endpoint
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
2023-10-26 23:16 ` [PATCH 1/8] usb: xhci: Guard all calls to xhci_wait_for_event Hector Martin
2023-10-26 23:16 ` [PATCH 2/8] usb: xhci: Better error handling in abort_td() Hector Martin
@ 2023-10-26 23:16 ` Hector Martin
2023-10-26 23:16 ` [PATCH 4/8] usb: xhci: Recover from halted non-control endpoints Hector Martin
` (6 subsequent siblings)
9 siblings, 0 replies; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:16 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
There is a race where an endpoint may halt by itself while we are trying
to halt it, which results in a context state error. See xHCI 4.6.9 which
mentions this case.
This also avoids BUGging when we attempt to stop an endpoint which was
already stopped to begin with, which is probably a bug elsewhere but
not a good reason to crash.
Signed-off-by: Hector Martin <marcan@marcan.st>
---
drivers/usb/host/xhci-ring.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index d08bb8e2bfba..5c2d16b58589 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -543,6 +543,7 @@ static void abort_td(struct usb_device *udev, int ep_index)
struct xhci_ctrl *ctrl = xhci_get_ctrl(udev);
struct xhci_ring *ring = ctrl->devs[udev->slot_id]->eps[ep_index].ring;
union xhci_trb *event;
+ xhci_comp_code comp;
trb_type type;
u64 addr;
u32 field;
@@ -571,10 +572,11 @@ static void abort_td(struct usb_device *udev, int ep_index)
printf("abort_td: Expected a TRB_TRANSFER TRB first\n");
}
+ comp = GET_COMP_CODE(le32_to_cpu(event->event_cmd.status));
BUG_ON(type != TRB_COMPLETION ||
TRB_TO_SLOT_ID(le32_to_cpu(event->event_cmd.flags))
- != udev->slot_id || GET_COMP_CODE(le32_to_cpu(
- event->event_cmd.status)) != COMP_SUCCESS);
+ != udev->slot_id || (comp != COMP_SUCCESS && comp
+ != COMP_CTX_STATE));
xhci_acknowledge_event(ctrl);
addr = xhci_trb_virt_to_dma(ring->enq_seg,
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 4/8] usb: xhci: Recover from halted non-control endpoints
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
` (2 preceding siblings ...)
2023-10-26 23:16 ` [PATCH 3/8] usb: xhci: Allow context state errors when halting an endpoint Hector Martin
@ 2023-10-26 23:16 ` Hector Martin
2023-10-26 23:16 ` [PATCH 5/8] usb: xhci: Fail on attempt to queue TRBs to a halted endpoint Hector Martin
` (5 subsequent siblings)
9 siblings, 0 replies; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:16 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
There is currently no codepath to recover from this case. In principle
we could require that the upper layer do this explicitly, but let's just
do it in xHCI when the next bulk transfer is started, since that
reasonably implies whatever caused the problem has been dealt with.
Signed-off-by: Hector Martin <marcan@marcan.st>
---
drivers/usb/host/xhci-ring.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 5c2d16b58589..60f2cf72dffa 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -669,6 +669,14 @@ int xhci_bulk_tx(struct usb_device *udev, unsigned long pipe,
ep_ctx = xhci_get_ep_ctx(ctrl, virt_dev->out_ctx, ep_index);
+ /*
+ * If the endpoint was halted due to a prior error, resume it before
+ * the next transfer. It is the responsibility of the upper layer to
+ * have dealt with whatever caused the error.
+ */
+ if ((le32_to_cpu(ep_ctx->ep_info) & EP_STATE_MASK) == EP_STATE_HALTED)
+ reset_ep(udev, ep_index);
+
ring = virt_dev->eps[ep_index].ring;
/*
* How much data is (potentially) left before the 64KB boundary?
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 5/8] usb: xhci: Fail on attempt to queue TRBs to a halted endpoint
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
` (3 preceding siblings ...)
2023-10-26 23:16 ` [PATCH 4/8] usb: xhci: Recover from halted non-control endpoints Hector Martin
@ 2023-10-26 23:16 ` Hector Martin
2023-10-26 23:16 ` [PATCH 6/8] usb: xhci: Do not panic on event timeouts Hector Martin
` (4 subsequent siblings)
9 siblings, 0 replies; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:16 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
This isn't going to work, don't pretend it will and then end up timing
out.
Signed-off-by: Hector Martin <marcan@marcan.st>
---
drivers/usb/host/xhci-ring.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 60f2cf72dffa..3ef8c2f14ccc 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -243,7 +243,8 @@ static int prepare_ring(struct xhci_ctrl *ctrl, struct xhci_ring *ep_ring,
puts("WARN waiting for error on ep to be cleared\n");
return -EINVAL;
case EP_STATE_HALTED:
- puts("WARN halted endpoint, queueing URB anyway.\n");
+ puts("WARN endpoint is halted\n");
+ return -EINVAL;
case EP_STATE_STOPPED:
case EP_STATE_RUNNING:
debug("EP STATE RUNNING.\n");
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 6/8] usb: xhci: Do not panic on event timeouts
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
` (4 preceding siblings ...)
2023-10-26 23:16 ` [PATCH 5/8] usb: xhci: Fail on attempt to queue TRBs to a halted endpoint Hector Martin
@ 2023-10-26 23:16 ` Hector Martin
2023-10-26 23:16 ` [PATCH 7/8] usb: xhci: Fix DMA address calculation in queue_trb Hector Martin
` (3 subsequent siblings)
9 siblings, 0 replies; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:16 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
Now that we always check the return value, just return NULL on timeouts.
We can still log the error since this is a problem, but it's not reason
to panic.
Signed-off-by: Hector Martin <marcan@marcan.st>
---
drivers/usb/host/xhci-ring.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 3ef8c2f14ccc..b95f20642943 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -494,8 +494,9 @@ union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected)
if (expected == TRB_TRANSFER)
return NULL;
- printf("XHCI timeout on event type %d... cannot recover.\n", expected);
- BUG();
+ printf("XHCI timeout on event type %d...\n", expected);
+
+ return NULL;
}
/*
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 7/8] usb: xhci: Fix DMA address calculation in queue_trb
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
` (5 preceding siblings ...)
2023-10-26 23:16 ` [PATCH 6/8] usb: xhci: Do not panic on event timeouts Hector Martin
@ 2023-10-26 23:16 ` Hector Martin
2023-10-26 23:16 ` [PATCH 8/8] usb: xhci: Add more debugging Hector Martin
` (2 subsequent siblings)
9 siblings, 0 replies; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:16 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
We need to get the DMA address before incrementing the pointer, as that
might move us onto another segment.
Signed-off-by: Hector Martin <marcan@marcan.st>
---
drivers/usb/host/xhci-ring.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index b95f20642943..b9518e4a6743 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -202,18 +202,22 @@ static dma_addr_t queue_trb(struct xhci_ctrl *ctrl, struct xhci_ring *ring,
bool more_trbs_coming, unsigned int *trb_fields)
{
struct xhci_generic_trb *trb;
+ dma_addr_t addr;
int i;
trb = &ring->enqueue->generic;
+
for (i = 0; i < 4; i++)
trb->field[i] = cpu_to_le32(trb_fields[i]);
xhci_flush_cache((uintptr_t)trb, sizeof(struct xhci_generic_trb));
+ addr = xhci_trb_virt_to_dma(ring->enq_seg, (union xhci_trb *)trb);
+
inc_enq(ctrl, ring, more_trbs_coming);
- return xhci_trb_virt_to_dma(ring->enq_seg, (union xhci_trb *)trb);
+ return addr;
}
/**
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH 8/8] usb: xhci: Add more debugging
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
` (6 preceding siblings ...)
2023-10-26 23:16 ` [PATCH 7/8] usb: xhci: Fix DMA address calculation in queue_trb Hector Martin
@ 2023-10-26 23:16 ` Hector Martin
2023-10-27 18:58 ` [PATCH 0/8] USB fixes: xHCI error handling Neal Gompa
2023-11-19 20:08 ` Marek Vasut
9 siblings, 0 replies; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:16 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
A bunch of miscellaneous debug messages to aid in working out USB
issues.
Signed-off-by: Hector Martin <marcan@marcan.st>
---
drivers/usb/host/xhci-ring.c | 29 ++++++++++++++++++++++++++---
1 file changed, 26 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index b9518e4a6743..e2bd2e999a8e 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -215,6 +215,9 @@ static dma_addr_t queue_trb(struct xhci_ctrl *ctrl, struct xhci_ring *ring,
addr = xhci_trb_virt_to_dma(ring->enq_seg, (union xhci_trb *)trb);
+ debug("trb @ %llx: %08x %08x %08x %08x\n", addr,
+ trb_fields[0], trb_fields[1], trb_fields[2], trb_fields[3]);
+
inc_enq(ctrl, ring, more_trbs_coming);
return addr;
@@ -297,6 +300,8 @@ void xhci_queue_command(struct xhci_ctrl *ctrl, dma_addr_t addr, u32 slot_id,
{
u32 fields[4];
+ debug("CMD: %llx 0x%x 0x%x %d\n", addr, slot_id, ep_index, cmd);
+
BUG_ON(prepare_ring(ctrl, ctrl->cmd_ring, EP_STATE_RUNNING));
fields[0] = lower_32_bits(addr);
@@ -472,8 +477,14 @@ union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected)
type = TRB_FIELD_TO_TYPE(le32_to_cpu(event->event_cmd.flags));
if (type == expected ||
- (expected == TRB_NONE && type != TRB_PORT_STATUS))
+ (expected == TRB_NONE && type != TRB_PORT_STATUS)) {
+ debug("Event: %08x %08x %08x %08x\n",
+ le32_to_cpu(event->generic.field[0]),
+ le32_to_cpu(event->generic.field[1]),
+ le32_to_cpu(event->generic.field[2]),
+ le32_to_cpu(event->generic.field[3]));
return event;
+ }
if (type == TRB_PORT_STATUS)
/* TODO: remove this once enumeration has been reworked */
@@ -485,8 +496,9 @@ union xhci_trb *xhci_wait_for_event(struct xhci_ctrl *ctrl, trb_type expected)
le32_to_cpu(event->generic.field[2])) !=
COMP_SUCCESS);
else
- printf("Unexpected XHCI event TRB, skipping... "
+ printf("Unexpected XHCI event TRB, expected %d... "
"(%08x %08x %08x %08x)\n",
+ expected,
le32_to_cpu(event->generic.field[0]),
le32_to_cpu(event->generic.field[1]),
le32_to_cpu(event->generic.field[2]),
@@ -601,10 +613,13 @@ static void abort_td(struct usb_device *udev, int ep_index)
static void record_transfer_result(struct usb_device *udev,
union xhci_trb *event, int length)
{
+ xhci_comp_code code = GET_COMP_CODE(
+ le32_to_cpu(event->trans_event.transfer_len));
+
udev->act_len = min(length, length -
(int)EVENT_TRB_LEN(le32_to_cpu(event->trans_event.transfer_len)));
- switch (GET_COMP_CODE(le32_to_cpu(event->trans_event.transfer_len))) {
+ switch (code) {
case COMP_SUCCESS:
BUG_ON(udev->act_len != length);
/* fallthrough */
@@ -612,16 +627,23 @@ static void record_transfer_result(struct usb_device *udev,
udev->status = 0;
break;
case COMP_STALL:
+ debug("Xfer STALL\n");
udev->status = USB_ST_STALLED;
break;
case COMP_DB_ERR:
+ debug("Xfer DB_ERR\n");
+ udev->status = USB_ST_BUF_ERR;
+ break;
case COMP_TRB_ERR:
+ debug("Xfer TRB_ERR\n");
udev->status = USB_ST_BUF_ERR;
break;
case COMP_BABBLE:
+ debug("Xfer BABBLE\n");
udev->status = USB_ST_BABBLE_DET;
break;
default:
+ debug("Xfer error: %d\n", code);
udev->status = 0x80; /* USB_ST_TOO_LAZY_TO_MAKE_A_NEW_MACRO */
}
}
@@ -1015,6 +1037,7 @@ int xhci_ctrl_tx(struct usb_device *udev, unsigned long pipe,
record_transfer_result(udev, event, length);
xhci_acknowledge_event(ctrl);
if (udev->status == USB_ST_STALLED) {
+ debug("EP %d stalled\n", ep_index);
reset_ep(udev, ep_index);
return -EPIPE;
}
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH] fixup! usb: xhci: Guard all calls to xhci_wait_for_event
2023-10-26 23:16 ` [PATCH 1/8] usb: xhci: Guard all calls to xhci_wait_for_event Hector Martin
@ 2023-10-26 23:26 ` Hector Martin
2023-10-27 0:36 ` Marek Vasut
0 siblings, 1 reply; 25+ messages in thread
From: Hector Martin @ 2023-10-26 23:26 UTC (permalink / raw)
To: Bin Meng, Marek Vasut; +Cc: Mark Kettenis, u-boot, asahi, Hector Martin
Gah, I should've paid more attention to that rebase. Here's a dumb
fixup for this patch. I'll squash it into a v2 if needed alongside
any other changes, or if it looks good feel free to apply/squash
it in directly.
---
drivers/usb/host/xhci-ring.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index e2bd2e999a8e..7f2507be0cf0 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -532,6 +532,7 @@ static void reset_ep(struct usb_device *udev, int ep_index)
event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
if (!event)
return;
+ field = le32_to_cpu(event->trans_event.flags);
BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
xhci_acknowledge_event(ctrl);
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH] fixup! usb: xhci: Guard all calls to xhci_wait_for_event
2023-10-26 23:26 ` [PATCH] fixup! " Hector Martin
@ 2023-10-27 0:36 ` Marek Vasut
2023-10-27 6:03 ` Hector Martin
0 siblings, 1 reply; 25+ messages in thread
From: Marek Vasut @ 2023-10-27 0:36 UTC (permalink / raw)
To: Hector Martin, Bin Meng; +Cc: Mark Kettenis, u-boot, asahi, Minda Chen
On 10/27/23 01:26, Hector Martin wrote:
> Gah, I should've paid more attention to that rebase. Here's a dumb
> fixup for this patch. I'll squash it into a v2 if needed alongside
> any other changes, or if it looks good feel free to apply/squash
> it in directly.
>
> ---
> drivers/usb/host/xhci-ring.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index e2bd2e999a8e..7f2507be0cf0 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -532,6 +532,7 @@ static void reset_ep(struct usb_device *udev, int ep_index)
> event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
> if (!event)
> return;
> + field = le32_to_cpu(event->trans_event.flags);
> BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
> xhci_acknowledge_event(ctrl);
Please squash, and add
Reviewed-by: Marek Vasut <marex@denx.de>
Also, +CC Minda,
there has been a similar fix to this one but with much more information
about the failure, see:
[PATCH v1] usb: xhci: Check return value of wait for TRB_TRANSFER event
I think it would be useful to somehow include that information, so it
wouldn't be lost.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 2/8] usb: xhci: Better error handling in abort_td()
2023-10-26 23:16 ` [PATCH 2/8] usb: xhci: Better error handling in abort_td() Hector Martin
@ 2023-10-27 0:52 ` Marek Vasut
0 siblings, 0 replies; 25+ messages in thread
From: Marek Vasut @ 2023-10-27 0:52 UTC (permalink / raw)
To: Hector Martin, Bin Meng; +Cc: Mark Kettenis, u-boot, asahi
On 10/27/23 01:16, Hector Martin wrote:
> If the xHC has a problem with our STOP ENDPOINT command, it is likely to
> return a completion directly instead of first a transfer event for the
> in-progress transfer. Handle that more gracefully.
>
> Right now we still BUG() on the error code, but at least we don't end up
> timing out on the event and ending up with unexpected event errors.
>
> Signed-off-by: Hector Martin <marcan@marcan.st>
I'll defer the rest of the review to Bin , he just knows better when it
comes to xHCI . Please ping me if things are stuck for too long however.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] fixup! usb: xhci: Guard all calls to xhci_wait_for_event
2023-10-27 0:36 ` Marek Vasut
@ 2023-10-27 6:03 ` Hector Martin
2023-10-29 21:25 ` Marek Vasut
0 siblings, 1 reply; 25+ messages in thread
From: Hector Martin @ 2023-10-27 6:03 UTC (permalink / raw)
To: Marek Vasut, Bin Meng; +Cc: Mark Kettenis, u-boot, asahi, Minda Chen
On 27/10/2023 09.36, Marek Vasut wrote:
> On 10/27/23 01:26, Hector Martin wrote:
>> Gah, I should've paid more attention to that rebase. Here's a dumb
>> fixup for this patch. I'll squash it into a v2 if needed alongside
>> any other changes, or if it looks good feel free to apply/squash
>> it in directly.
>>
>> ---
>> drivers/usb/host/xhci-ring.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
>> index e2bd2e999a8e..7f2507be0cf0 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -532,6 +532,7 @@ static void reset_ep(struct usb_device *udev, int ep_index)
>> event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
>> if (!event)
>> return;
>> + field = le32_to_cpu(event->trans_event.flags);
>> BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
>> xhci_acknowledge_event(ctrl);
>
> Please squash, and add
>
> Reviewed-by: Marek Vasut <marex@denx.de>
>
> Also, +CC Minda,
>
> there has been a similar fix to this one but with much more information
> about the failure, see:
>
> [PATCH v1] usb: xhci: Check return value of wait for TRB_TRANSFER event
>
> I think it would be useful to somehow include that information, so it
> wouldn't be lost.
The root cause for *that* failure is what I fix in patch #5. From that
thread:
> scanning bus xhci_pci for devices... WARN halted endpoint, queueing
URB anyway.
Without #5 and without #1, that situation then leads to fireworks.
A bunch of things are broken, which is why this is a series and not a
single patch. I have more fixes to timeout handling, mass storage, etc.
coming up after this. I fixed most of this in one long day of trying
random USB devices, so it's not so much subtle super specific problems I
could document the crashes for as just wide-ranging problems in the
u-boot USB stack. None of this is particularly hard to repro if you just
try a bunch of normal consumer USB devices, including things like USB
HDDs which take time to spin-up leading to timeouts etc. The crash dumps
aren't particularly useful, it's the subtleties of the xHCI interactions
that are, but for that you need to add and enable a lot more debugging
(patch #8).
I think the main reason all this stuff is broken and we're finding out
now is that u-boot has traditionally been used in embedded devices with
fairly narrow use cases for USB, and now we're running it on
workstation-grade laptops and desktops and people are plugging in all
kinds of normal USB devices and expecting their bootloader not to freak
out with them (even though most of the time it doesn't need to talk to
them). It's really easy to run into this stuff when that's your use case.
- Hector
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] USB fixes: xHCI error handling
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
` (7 preceding siblings ...)
2023-10-26 23:16 ` [PATCH 8/8] usb: xhci: Add more debugging Hector Martin
@ 2023-10-27 18:58 ` Neal Gompa
2023-10-27 19:42 ` Neal Gompa
2023-11-19 20:08 ` Marek Vasut
9 siblings, 1 reply; 25+ messages in thread
From: Neal Gompa @ 2023-10-27 18:58 UTC (permalink / raw)
To: Hector Martin; +Cc: Bin Meng, Marek Vasut, Mark Kettenis, u-boot, asahi
On Thu, Oct 26, 2023 at 7:26 PM Hector Martin <marcan@marcan.st> wrote:
>
> This series is the first of a few bundles of USB fixes we have been
> carrying downstream on the Asahi U-Boot branch for a few months.
>
> Most importantly, this related set of patches makes xHCI error/stall
> recovery more robust (or work at all in some cases). There are also a
> couple patches fixing other xHCI bugs and adding better debug logs.
>
> I believe this should fix this Fedora bug too:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2244305
>
> Signed-off-by: Hector Martin <marcan@marcan.st>
> ---
> Hector Martin (8):
> usb: xhci: Guard all calls to xhci_wait_for_event
> usb: xhci: Better error handling in abort_td()
> usb: xhci: Allow context state errors when halting an endpoint
> usb: xhci: Recover from halted non-control endpoints
> usb: xhci: Fail on attempt to queue TRBs to a halted endpoint
> usb: xhci: Do not panic on event timeouts
> usb: xhci: Fix DMA address calculation in queue_trb
> usb: xhci: Add more debugging
>
> drivers/usb/host/xhci-ring.c | 100 +++++++++++++++++++++++++++++++++++--------
> drivers/usb/host/xhci.c | 9 ++++
> include/usb/xhci.h | 2 +
> 3 files changed, 92 insertions(+), 19 deletions(-)
> ---
> base-commit: fb428b61819444b9337075f49c72f326f5d12085
> change-id: 20231027-usb-fixes-1-83bfc7013012
>
The series looks reasonable and has worked quite well in Fedora Asahi.
Reviewed-by: Neal Gompa <neal@gompa.dev>
--
真実はいつも一つ!/ Always, there's only one truth!
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] USB fixes: xHCI error handling
2023-10-27 18:58 ` [PATCH 0/8] USB fixes: xHCI error handling Neal Gompa
@ 2023-10-27 19:42 ` Neal Gompa
0 siblings, 0 replies; 25+ messages in thread
From: Neal Gompa @ 2023-10-27 19:42 UTC (permalink / raw)
To: Hector Martin; +Cc: Bin Meng, Marek Vasut, Mark Kettenis, u-boot, asahi
On Fri, Oct 27, 2023 at 2:58 PM Neal Gompa <neal@gompa.dev> wrote:
>
> On Thu, Oct 26, 2023 at 7:26 PM Hector Martin <marcan@marcan.st> wrote:
> >
> > This series is the first of a few bundles of USB fixes we have been
> > carrying downstream on the Asahi U-Boot branch for a few months.
> >
> > Most importantly, this related set of patches makes xHCI error/stall
> > recovery more robust (or work at all in some cases). There are also a
> > couple patches fixing other xHCI bugs and adding better debug logs.
> >
> > I believe this should fix this Fedora bug too:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=2244305
> >
> > Signed-off-by: Hector Martin <marcan@marcan.st>
> > ---
> > Hector Martin (8):
> > usb: xhci: Guard all calls to xhci_wait_for_event
> > usb: xhci: Better error handling in abort_td()
> > usb: xhci: Allow context state errors when halting an endpoint
> > usb: xhci: Recover from halted non-control endpoints
> > usb: xhci: Fail on attempt to queue TRBs to a halted endpoint
> > usb: xhci: Do not panic on event timeouts
> > usb: xhci: Fix DMA address calculation in queue_trb
> > usb: xhci: Add more debugging
> >
> > drivers/usb/host/xhci-ring.c | 100 +++++++++++++++++++++++++++++++++++--------
> > drivers/usb/host/xhci.c | 9 ++++
> > include/usb/xhci.h | 2 +
> > 3 files changed, 92 insertions(+), 19 deletions(-)
> > ---
> > base-commit: fb428b61819444b9337075f49c72f326f5d12085
> > change-id: 20231027-usb-fixes-1-83bfc7013012
> >
>
> The series looks reasonable and has worked quite well in Fedora Asahi.
>
> Reviewed-by: Neal Gompa <neal@gompa.dev>
>
Resending now that I'm subscribed to the U-Boot mailing list...
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
--
真実はいつも一つ!/ Always, there's only one truth!
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] fixup! usb: xhci: Guard all calls to xhci_wait_for_event
2023-10-27 6:03 ` Hector Martin
@ 2023-10-29 21:25 ` Marek Vasut
2023-10-29 21:33 ` Peter Robinson
0 siblings, 1 reply; 25+ messages in thread
From: Marek Vasut @ 2023-10-29 21:25 UTC (permalink / raw)
To: Hector Martin, Bin Meng; +Cc: Mark Kettenis, u-boot, asahi, Minda Chen
On 10/27/23 08:03, Hector Martin wrote:
> On 27/10/2023 09.36, Marek Vasut wrote:
>> On 10/27/23 01:26, Hector Martin wrote:
>>> Gah, I should've paid more attention to that rebase. Here's a dumb
>>> fixup for this patch. I'll squash it into a v2 if needed alongside
>>> any other changes, or if it looks good feel free to apply/squash
>>> it in directly.
>>>
>>> ---
>>> drivers/usb/host/xhci-ring.c | 1 +
>>> 1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
>>> index e2bd2e999a8e..7f2507be0cf0 100644
>>> --- a/drivers/usb/host/xhci-ring.c
>>> +++ b/drivers/usb/host/xhci-ring.c
>>> @@ -532,6 +532,7 @@ static void reset_ep(struct usb_device *udev, int ep_index)
>>> event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
>>> if (!event)
>>> return;
>>> + field = le32_to_cpu(event->trans_event.flags);
>>> BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
>>> xhci_acknowledge_event(ctrl);
>>
>> Please squash, and add
>>
>> Reviewed-by: Marek Vasut <marex@denx.de>
>>
>> Also, +CC Minda,
>>
>> there has been a similar fix to this one but with much more information
>> about the failure, see:
>>
>> [PATCH v1] usb: xhci: Check return value of wait for TRB_TRANSFER event
>>
>> I think it would be useful to somehow include that information, so it
>> wouldn't be lost.
>
> The root cause for *that* failure is what I fix in patch #5. From that
> thread:
>
>> scanning bus xhci_pci for devices... WARN halted endpoint, queueing
> URB anyway.
>
> Without #5 and without #1, that situation then leads to fireworks.
>
> A bunch of things are broken, which is why this is a series and not a
> single patch. I have more fixes to timeout handling, mass storage, etc.
> coming up after this. I fixed most of this in one long day of trying
> random USB devices, so it's not so much subtle super specific problems I
> could document the crashes for as just wide-ranging problems in the
> u-boot USB stack. None of this is particularly hard to repro if you just
> try a bunch of normal consumer USB devices, including things like USB
> HDDs which take time to spin-up leading to timeouts etc.
I think majority of users I can think of use USB mass storage devices,
like USB pen drives, not so much HDDs as far as I can tell.
> The crash dumps
> aren't particularly useful, it's the subtleties of the xHCI interactions
> that are, but for that you need to add and enable a lot more debugging
> (patch #8).
The crash dumps are more for posterity, when someone searches for a
problem they see. Search engines tend to pick those up and it might
point those people in the right direction.
Also, it is good to include information about what triggered the crash,
e.g. which USB device on which hardware, in case this is needed in the
future.
> I think the main reason all this stuff is broken and we're finding out
> now is that u-boot has traditionally been used in embedded devices with
> fairly narrow use cases for USB
Yes
>, and now we're running it on
> workstation-grade laptops and desktops and people are plugging in all
> kinds of normal USB devices and expecting their bootloader not to freak
> out with them (even though most of the time it doesn't need to talk to
> them). It's really easy to run into this stuff when that's your use case.
It is really appreciated to see people fix things like this stuff, thanks !
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] fixup! usb: xhci: Guard all calls to xhci_wait_for_event
2023-10-29 21:25 ` Marek Vasut
@ 2023-10-29 21:33 ` Peter Robinson
2023-10-29 22:01 ` Tony Dinh
0 siblings, 1 reply; 25+ messages in thread
From: Peter Robinson @ 2023-10-29 21:33 UTC (permalink / raw)
To: Marek Vasut
Cc: Hector Martin, Bin Meng, Mark Kettenis, u-boot, asahi, Minda Chen
On Sun, Oct 29, 2023 at 9:25 PM Marek Vasut <marex@denx.de> wrote:
>
> On 10/27/23 08:03, Hector Martin wrote:
> > On 27/10/2023 09.36, Marek Vasut wrote:
> >> On 10/27/23 01:26, Hector Martin wrote:
> >>> Gah, I should've paid more attention to that rebase. Here's a dumb
> >>> fixup for this patch. I'll squash it into a v2 if needed alongside
> >>> any other changes, or if it looks good feel free to apply/squash
> >>> it in directly.
> >>>
> >>> ---
> >>> drivers/usb/host/xhci-ring.c | 1 +
> >>> 1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> >>> index e2bd2e999a8e..7f2507be0cf0 100644
> >>> --- a/drivers/usb/host/xhci-ring.c
> >>> +++ b/drivers/usb/host/xhci-ring.c
> >>> @@ -532,6 +532,7 @@ static void reset_ep(struct usb_device *udev, int ep_index)
> >>> event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
> >>> if (!event)
> >>> return;
> >>> + field = le32_to_cpu(event->trans_event.flags);
> >>> BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
> >>> xhci_acknowledge_event(ctrl);
> >>
> >> Please squash, and add
> >>
> >> Reviewed-by: Marek Vasut <marex@denx.de>
> >>
> >> Also, +CC Minda,
> >>
> >> there has been a similar fix to this one but with much more information
> >> about the failure, see:
> >>
> >> [PATCH v1] usb: xhci: Check return value of wait for TRB_TRANSFER event
> >>
> >> I think it would be useful to somehow include that information, so it
> >> wouldn't be lost.
> >
> > The root cause for *that* failure is what I fix in patch #5. From that
> > thread:
> >
> >> scanning bus xhci_pci for devices... WARN halted endpoint, queueing
> > URB anyway.
> >
> > Without #5 and without #1, that situation then leads to fireworks.
> >
> > A bunch of things are broken, which is why this is a series and not a
> > single patch. I have more fixes to timeout handling, mass storage, etc.
> > coming up after this. I fixed most of this in one long day of trying
> > random USB devices, so it's not so much subtle super specific problems I
> > could document the crashes for as just wide-ranging problems in the
> > u-boot USB stack. None of this is particularly hard to repro if you just
> > try a bunch of normal consumer USB devices, including things like USB
> > HDDs which take time to spin-up leading to timeouts etc.
>
> I think majority of users I can think of use USB mass storage devices,
> like USB pen drives, not so much HDDs as far as I can tell.
We see a bunch of spinning rust users in Fedora, these sorts of
devices are used by a bunch of people that want to run up cheap home
NAS devices so from experience I'd say while not usual to be USB
sticks and some form of solid state storage, spinning disk isn't
unusual.
> > The crash dumps
> > aren't particularly useful, it's the subtleties of the xHCI interactions
> > that are, but for that you need to add and enable a lot more debugging
> > (patch #8).
>
> The crash dumps are more for posterity, when someone searches for a
> problem they see. Search engines tend to pick those up and it might
> point those people in the right direction.
>
> Also, it is good to include information about what triggered the crash,
> e.g. which USB device on which hardware, in case this is needed in the
> future.
>
> > I think the main reason all this stuff is broken and we're finding out
> > now is that u-boot has traditionally been used in embedded devices with
> > fairly narrow use cases for USB
>
> Yes
>
> >, and now we're running it on
> > workstation-grade laptops and desktops and people are plugging in all
> > kinds of normal USB devices and expecting their bootloader not to freak
> > out with them (even though most of the time it doesn't need to talk to
> > them). It's really easy to run into this stuff when that's your use case.
>
> It is really appreciated to see people fix things like this stuff, thanks !
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] fixup! usb: xhci: Guard all calls to xhci_wait_for_event
2023-10-29 21:33 ` Peter Robinson
@ 2023-10-29 22:01 ` Tony Dinh
0 siblings, 0 replies; 25+ messages in thread
From: Tony Dinh @ 2023-10-29 22:01 UTC (permalink / raw)
To: Peter Robinson
Cc: Marek Vasut, Hector Martin, Bin Meng, Mark Kettenis, u-boot,
asahi, Minda Chen
On Sun, Oct 29, 2023 at 2:33 PM Peter Robinson <pbrobinson@gmail.com> wrote:
>
> On Sun, Oct 29, 2023 at 9:25 PM Marek Vasut <marex@denx.de> wrote:
> >
> > On 10/27/23 08:03, Hector Martin wrote:
> > > On 27/10/2023 09.36, Marek Vasut wrote:
> > >> On 10/27/23 01:26, Hector Martin wrote:
> > >>> Gah, I should've paid more attention to that rebase. Here's a dumb
> > >>> fixup for this patch. I'll squash it into a v2 if needed alongside
> > >>> any other changes, or if it looks good feel free to apply/squash
> > >>> it in directly.
> > >>>
> > >>> ---
> > >>> drivers/usb/host/xhci-ring.c | 1 +
> > >>> 1 file changed, 1 insertion(+)
> > >>>
> > >>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> > >>> index e2bd2e999a8e..7f2507be0cf0 100644
> > >>> --- a/drivers/usb/host/xhci-ring.c
> > >>> +++ b/drivers/usb/host/xhci-ring.c
> > >>> @@ -532,6 +532,7 @@ static void reset_ep(struct usb_device *udev, int ep_index)
> > >>> event = xhci_wait_for_event(ctrl, TRB_COMPLETION);
> > >>> if (!event)
> > >>> return;
> > >>> + field = le32_to_cpu(event->trans_event.flags);
> > >>> BUG_ON(TRB_TO_SLOT_ID(field) != udev->slot_id);
> > >>> xhci_acknowledge_event(ctrl);
> > >>
> > >> Please squash, and add
> > >>
> > >> Reviewed-by: Marek Vasut <marex@denx.de>
> > >>
> > >> Also, +CC Minda,
> > >>
> > >> there has been a similar fix to this one but with much more information
> > >> about the failure, see:
> > >>
> > >> [PATCH v1] usb: xhci: Check return value of wait for TRB_TRANSFER event
> > >>
> > >> I think it would be useful to somehow include that information, so it
> > >> wouldn't be lost.
> > >
> > > The root cause for *that* failure is what I fix in patch #5. From that
> > > thread:
> > >
> > >> scanning bus xhci_pci for devices... WARN halted endpoint, queueing
> > > URB anyway.
> > >
> > > Without #5 and without #1, that situation then leads to fireworks.
> > >
> > > A bunch of things are broken, which is why this is a series and not a
> > > single patch. I have more fixes to timeout handling, mass storage, etc.
> > > coming up after this. I fixed most of this in one long day of trying
> > > random USB devices, so it's not so much subtle super specific problems I
> > > could document the crashes for as just wide-ranging problems in the
> > > u-boot USB stack. None of this is particularly hard to repro if you just
> > > try a bunch of normal consumer USB devices, including things like USB
> > > HDDs which take time to spin-up leading to timeouts etc.
> >
> > I think majority of users I can think of use USB mass storage devices,
> > like USB pen drives, not so much HDDs as far as I can tell.
>
> We see a bunch of spinning rust users in Fedora, these sorts of
> devices are used by a bunch of people that want to run up cheap home
> NAS devices so from experience I'd say while not usual to be USB
> sticks and some form of solid state storage, spinning disk isn't
> unusual.
What Peter said. A very common use case, even more so than USB flash
drives, is for the consumers to plug in a USB HDD to their routers or
home NAS, as a cheap and quick solution. I've seen this type of
timeout failure happen quite often with large >= 3TB HDDs in USB
enclosure.
All the best,
Tony
>
> > > The crash dumps
> > > aren't particularly useful, it's the subtleties of the xHCI interactions
> > > that are, but for that you need to add and enable a lot more debugging
> > > (patch #8).
> >
> > The crash dumps are more for posterity, when someone searches for a
> > problem they see. Search engines tend to pick those up and it might
> > point those people in the right direction.
> >
> > Also, it is good to include information about what triggered the crash,
> > e.g. which USB device on which hardware, in case this is needed in the
> > future.
> >
> > > I think the main reason all this stuff is broken and we're finding out
> > > now is that u-boot has traditionally been used in embedded devices with
> > > fairly narrow use cases for USB
> >
> > Yes
> >
> > >, and now we're running it on
> > > workstation-grade laptops and desktops and people are plugging in all
> > > kinds of normal USB devices and expecting their bootloader not to freak
> > > out with them (even though most of the time it doesn't need to talk to
> > > them). It's really easy to run into this stuff when that's your use case.
> >
> > It is really appreciated to see people fix things like this stuff, thanks !
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] USB fixes: xHCI error handling
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
` (8 preceding siblings ...)
2023-10-27 18:58 ` [PATCH 0/8] USB fixes: xHCI error handling Neal Gompa
@ 2023-11-19 20:08 ` Marek Vasut
2023-11-19 23:17 ` Shantur Rathore
9 siblings, 1 reply; 25+ messages in thread
From: Marek Vasut @ 2023-11-19 20:08 UTC (permalink / raw)
To: Hector Martin, Bin Meng; +Cc: Mark Kettenis, u-boot, asahi
On 10/27/23 01:16, Hector Martin wrote:
> This series is the first of a few bundles of USB fixes we have been
> carrying downstream on the Asahi U-Boot branch for a few months.
>
> Most importantly, this related set of patches makes xHCI error/stall
> recovery more robust (or work at all in some cases). There are also a
> couple patches fixing other xHCI bugs and adding better debug logs.
>
> I believe this should fix this Fedora bug too:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2244305
Was there ever a V2 of these patches I might've missed ?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] USB fixes: xHCI error handling
2023-11-19 20:08 ` Marek Vasut
@ 2023-11-19 23:17 ` Shantur Rathore
2023-11-20 2:09 ` Marek Vasut
0 siblings, 1 reply; 25+ messages in thread
From: Shantur Rathore @ 2023-11-19 23:17 UTC (permalink / raw)
To: Marek Vasut; +Cc: Hector Martin, Bin Meng, Mark Kettenis, u-boot, asahi
On Sun, Nov 19, 2023 at 8:08 PM Marek Vasut <marex@denx.de> wrote:
>
> On 10/27/23 01:16, Hector Martin wrote:
> > This series is the first of a few bundles of USB fixes we have been
> > carrying downstream on the Asahi U-Boot branch for a few months.
> >
> > Most importantly, this related set of patches makes xHCI error/stall
> > recovery more robust (or work at all in some cases). There are also a
> > couple patches fixing other xHCI bugs and adding better debug logs.
> >
> > I believe this should fix this Fedora bug too:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=2244305
>
> Was there ever a V2 of these patches I might've missed ?
Is it this one?
https://patchwork.ozlabs.org/project/uboot/list/?series=379807
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] USB fixes: xHCI error handling
2023-11-19 23:17 ` Shantur Rathore
@ 2023-11-20 2:09 ` Marek Vasut
2023-11-20 10:45 ` Hector Martin
0 siblings, 1 reply; 25+ messages in thread
From: Marek Vasut @ 2023-11-20 2:09 UTC (permalink / raw)
To: Shantur Rathore; +Cc: Hector Martin, Bin Meng, Mark Kettenis, u-boot, asahi
On 11/20/23 00:17, Shantur Rathore wrote:
> On Sun, Nov 19, 2023 at 8:08 PM Marek Vasut <marex@denx.de> wrote:
>>
>> On 10/27/23 01:16, Hector Martin wrote:
>>> This series is the first of a few bundles of USB fixes we have been
>>> carrying downstream on the Asahi U-Boot branch for a few months.
>>>
>>> Most importantly, this related set of patches makes xHCI error/stall
>>> recovery more robust (or work at all in some cases). There are also a
>>> couple patches fixing other xHCI bugs and adding better debug logs.
>>>
>>> I believe this should fix this Fedora bug too:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=2244305
>>
>> Was there ever a V2 of these patches I might've missed ?
>
> Is it this one?
> https://patchwork.ozlabs.org/project/uboot/list/?series=379807
I think so, thanks.
And uh, my question therefore it, is there a V3 which addresses the 3/8
and 8/8 comment ?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] USB fixes: xHCI error handling
2023-11-20 2:09 ` Marek Vasut
@ 2023-11-20 10:45 ` Hector Martin
2023-11-20 12:15 ` Marek Vasut
0 siblings, 1 reply; 25+ messages in thread
From: Hector Martin @ 2023-11-20 10:45 UTC (permalink / raw)
To: Marek Vasut, Shantur Rathore; +Cc: Bin Meng, Mark Kettenis, u-boot, asahi
On 2023/11/20 11:09, Marek Vasut wrote:
> On 11/20/23 00:17, Shantur Rathore wrote:
>> On Sun, Nov 19, 2023 at 8:08 PM Marek Vasut <marex@denx.de> wrote:
>>>
>>> On 10/27/23 01:16, Hector Martin wrote:
>>>> This series is the first of a few bundles of USB fixes we have been
>>>> carrying downstream on the Asahi U-Boot branch for a few months.
>>>>
>>>> Most importantly, this related set of patches makes xHCI error/stall
>>>> recovery more robust (or work at all in some cases). There are also a
>>>> couple patches fixing other xHCI bugs and adding better debug logs.
>>>>
>>>> I believe this should fix this Fedora bug too:
>>>>
>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2244305
>>>
>>> Was there ever a V2 of these patches I might've missed ?
>>
>> Is it this one?
>> https://patchwork.ozlabs.org/project/uboot/list/?series=379807
>
> I think so, thanks.
>
> And uh, my question therefore it, is there a V3 which addresses the 3/8
> and 8/8 comment ?
Not yet, no. Sorry, I probably won't have time to work on this in a
while, currently busy with other stuff.
- Hector
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] USB fixes: xHCI error handling
2023-11-20 10:45 ` Hector Martin
@ 2023-11-20 12:15 ` Marek Vasut
2023-11-21 6:46 ` Hector Martin
0 siblings, 1 reply; 25+ messages in thread
From: Marek Vasut @ 2023-11-20 12:15 UTC (permalink / raw)
To: Hector Martin, Shantur Rathore; +Cc: Bin Meng, Mark Kettenis, u-boot, asahi
On 11/20/23 11:45, Hector Martin wrote:
>
>
> On 2023/11/20 11:09, Marek Vasut wrote:
>> On 11/20/23 00:17, Shantur Rathore wrote:
>>> On Sun, Nov 19, 2023 at 8:08 PM Marek Vasut <marex@denx.de> wrote:
>>>>
>>>> On 10/27/23 01:16, Hector Martin wrote:
>>>>> This series is the first of a few bundles of USB fixes we have been
>>>>> carrying downstream on the Asahi U-Boot branch for a few months.
>>>>>
>>>>> Most importantly, this related set of patches makes xHCI error/stall
>>>>> recovery more robust (or work at all in some cases). There are also a
>>>>> couple patches fixing other xHCI bugs and adding better debug logs.
>>>>>
>>>>> I believe this should fix this Fedora bug too:
>>>>>
>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2244305
>>>>
>>>> Was there ever a V2 of these patches I might've missed ?
>>>
>>> Is it this one?
>>> https://patchwork.ozlabs.org/project/uboot/list/?series=379807
>>
>> I think so, thanks.
>>
>> And uh, my question therefore it, is there a V3 which addresses the 3/8
>> and 8/8 comment ?
>
> Not yet, no. Sorry, I probably won't have time to work on this in a
> while, currently busy with other stuff.
I can probably fix the patches up myself if that is fine with you, I'd
really like to get these fixes into the release soon. Would that be OK
with you ?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] USB fixes: xHCI error handling
2023-11-20 12:15 ` Marek Vasut
@ 2023-11-21 6:46 ` Hector Martin
2023-11-22 23:52 ` Marek Vasut
0 siblings, 1 reply; 25+ messages in thread
From: Hector Martin @ 2023-11-21 6:46 UTC (permalink / raw)
To: Marek Vasut, Shantur Rathore; +Cc: Bin Meng, Mark Kettenis, u-boot, asahi
On 2023/11/20 21:15, Marek Vasut wrote:
> On 11/20/23 11:45, Hector Martin wrote:
>>
>>
>> On 2023/11/20 11:09, Marek Vasut wrote:
>>> On 11/20/23 00:17, Shantur Rathore wrote:
>>>> On Sun, Nov 19, 2023 at 8:08 PM Marek Vasut <marex@denx.de> wrote:
>>>>>
>>>>> On 10/27/23 01:16, Hector Martin wrote:
>>>>>> This series is the first of a few bundles of USB fixes we have been
>>>>>> carrying downstream on the Asahi U-Boot branch for a few months.
>>>>>>
>>>>>> Most importantly, this related set of patches makes xHCI error/stall
>>>>>> recovery more robust (or work at all in some cases). There are also a
>>>>>> couple patches fixing other xHCI bugs and adding better debug logs.
>>>>>>
>>>>>> I believe this should fix this Fedora bug too:
>>>>>>
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2244305
>>>>>
>>>>> Was there ever a V2 of these patches I might've missed ?
>>>>
>>>> Is it this one?
>>>> https://patchwork.ozlabs.org/project/uboot/list/?series=379807
>>>
>>> I think so, thanks.
>>>
>>> And uh, my question therefore it, is there a V3 which addresses the 3/8
>>> and 8/8 comment ?
>>
>> Not yet, no. Sorry, I probably won't have time to work on this in a
>> while, currently busy with other stuff.
>
> I can probably fix the patches up myself if that is fine with you, I'd
> really like to get these fixes into the release soon. Would that be OK
> with you ?
>
Of course, I would appreciate that :)
- Hector
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/8] USB fixes: xHCI error handling
2023-11-21 6:46 ` Hector Martin
@ 2023-11-22 23:52 ` Marek Vasut
0 siblings, 0 replies; 25+ messages in thread
From: Marek Vasut @ 2023-11-22 23:52 UTC (permalink / raw)
To: Hector Martin, Shantur Rathore; +Cc: Bin Meng, Mark Kettenis, u-boot, asahi
On 11/21/23 07:46, Hector Martin wrote:
>
>
> On 2023/11/20 21:15, Marek Vasut wrote:
>> On 11/20/23 11:45, Hector Martin wrote:
>>>
>>>
>>> On 2023/11/20 11:09, Marek Vasut wrote:
>>>> On 11/20/23 00:17, Shantur Rathore wrote:
>>>>> On Sun, Nov 19, 2023 at 8:08 PM Marek Vasut <marex@denx.de> wrote:
>>>>>>
>>>>>> On 10/27/23 01:16, Hector Martin wrote:
>>>>>>> This series is the first of a few bundles of USB fixes we have been
>>>>>>> carrying downstream on the Asahi U-Boot branch for a few months.
>>>>>>>
>>>>>>> Most importantly, this related set of patches makes xHCI error/stall
>>>>>>> recovery more robust (or work at all in some cases). There are also a
>>>>>>> couple patches fixing other xHCI bugs and adding better debug logs.
>>>>>>>
>>>>>>> I believe this should fix this Fedora bug too:
>>>>>>>
>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2244305
>>>>>>
>>>>>> Was there ever a V2 of these patches I might've missed ?
>>>>>
>>>>> Is it this one?
>>>>> https://patchwork.ozlabs.org/project/uboot/list/?series=379807
>>>>
>>>> I think so, thanks.
>>>>
>>>> And uh, my question therefore it, is there a V3 which addresses the 3/8
>>>> and 8/8 comment ?
>>>
>>> Not yet, no. Sorry, I probably won't have time to work on this in a
>>> while, currently busy with other stuff.
>>
>> I can probably fix the patches up myself if that is fine with you, I'd
>> really like to get these fixes into the release soon. Would that be OK
>> with you ?
>>
>
> Of course, I would appreciate that :)
I picked a subset to usb/master and fixed a subset, for the rest I
really do need your input on those patches.
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2023-11-22 23:53 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-26 23:16 [PATCH 0/8] USB fixes: xHCI error handling Hector Martin
2023-10-26 23:16 ` [PATCH 1/8] usb: xhci: Guard all calls to xhci_wait_for_event Hector Martin
2023-10-26 23:26 ` [PATCH] fixup! " Hector Martin
2023-10-27 0:36 ` Marek Vasut
2023-10-27 6:03 ` Hector Martin
2023-10-29 21:25 ` Marek Vasut
2023-10-29 21:33 ` Peter Robinson
2023-10-29 22:01 ` Tony Dinh
2023-10-26 23:16 ` [PATCH 2/8] usb: xhci: Better error handling in abort_td() Hector Martin
2023-10-27 0:52 ` Marek Vasut
2023-10-26 23:16 ` [PATCH 3/8] usb: xhci: Allow context state errors when halting an endpoint Hector Martin
2023-10-26 23:16 ` [PATCH 4/8] usb: xhci: Recover from halted non-control endpoints Hector Martin
2023-10-26 23:16 ` [PATCH 5/8] usb: xhci: Fail on attempt to queue TRBs to a halted endpoint Hector Martin
2023-10-26 23:16 ` [PATCH 6/8] usb: xhci: Do not panic on event timeouts Hector Martin
2023-10-26 23:16 ` [PATCH 7/8] usb: xhci: Fix DMA address calculation in queue_trb Hector Martin
2023-10-26 23:16 ` [PATCH 8/8] usb: xhci: Add more debugging Hector Martin
2023-10-27 18:58 ` [PATCH 0/8] USB fixes: xHCI error handling Neal Gompa
2023-10-27 19:42 ` Neal Gompa
2023-11-19 20:08 ` Marek Vasut
2023-11-19 23:17 ` Shantur Rathore
2023-11-20 2:09 ` Marek Vasut
2023-11-20 10:45 ` Hector Martin
2023-11-20 12:15 ` Marek Vasut
2023-11-21 6:46 ` Hector Martin
2023-11-22 23:52 ` Marek Vasut
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox