All of lore.kernel.org
 help / color / mirror / Atom feed
* HC died
@ 2023-02-17 14:21 Seth Bollinger
  2023-02-21 16:26 ` Mathias Nyman
  0 siblings, 1 reply; 9+ messages in thread
From: Seth Bollinger @ 2023-02-17 14:21 UTC (permalink / raw)
  To: linux-usb; +Cc: sethb

Hello All,

We're experiencing a problem with our devices in the field where our
customers attach problematic USB devices that are causing the xhci
host controller to shut down with the "HC died; cleaning up" message.

I've narrowed this down to a timeout of the address device TRB on the
command ring (currently 5 seconds).  It sometimes takes our hardware
9.6 to complete this TRB.  When the driver is trying to stop the cmd
ring, the controller is busy for an additional 4.6 seconds.  This
results in the "HC died" message and shutdown of the host controller.

If I bump the command ring timeout beyond the max TRB completion time,
the host controller continues to be responsive and doesn't need to be
shut down.

My knowledge of how the usb protocol should handle this problem isn't
strong enough to know if there is a better solution than simply
increasing the command ring default timeout.

Is there a better way to handle this problem?

I've reproduced this on kernel 6.0, and 6.1.

Here is a log of the issue, along with some extra tracing I added to
track the duration of the command ring TRB completions.  FYI, I
recreate this problem by connecting/disconnecting the 5V pin quickly.
The theory is that the device is no longer present on the bus when the
hardware is trying to address the device.  I understand this is
somewhat degenerate, but it's happening frequently to our customers in
the field and it would be nice if the HC didn't die requiring them to
reboot to recover.

[F00:P06]  Feb 16 16:40:14  kernel: usb 3-2.1: new high-speed USB
device number 13 using xhci_hcd
[F00:P07]  Feb 16 16:40:14  kernel: xhci_hcd 0002:01:00.0: Set root
hub portnum to 4
[F00:P07]  Feb 16 16:40:14  kernel: xhci_hcd 0002:01:00.0: Set fake
root hub portnum to 2
[F00:P07]  Feb 16 16:40:14  kernel: xhci_hcd 0002:01:00.0: udev->tt =
0000000000000000
[F00:P07]  Feb 16 16:40:14  kernel: xhci_hcd 0002:01:00.0: udev->ttport = 0x0
[F00:P06]  Feb 16 16:40:14  kernel: xhci_hcd 0002:01:00.0:
xhci_setup_device: xhci_queue_address_device
[F00:P07]  Feb 16 16:40:14  kernel: xhci_hcd 0002:01:00.0: // Ding dong!
[F00:P06]  Feb 16 16:40:23  kernel: xhci_hcd 0002:01:00.0: completion
cmd 11, time: 163776215, duration: 9596181 us
[F00:P04]  Feb 16 16:40:23  kernel: usb 3-2.1: Device not responding
to setup address.
[F00:P07]  Feb 16 16:40:23  kernel: xhci_hcd 0002:01:00.0: // Ding dong!
[F00:P06]  Feb 16 16:40:23  kernel: xhci_hcd 0002:01:00.0: completion
cmd 10, time: 163778807, duration: 2530 us
[F00:P06]  Feb 16 16:40:23  kernel: xhci_hcd 0002:01:00.0: completion
cmd 9, time: 163778906, duration: 90 us
[F00:P07]  Feb 16 16:40:23  kernel: xhci_hcd 0002:01:00.0: Slot 15
output ctx = 0x8422f000 (dma)
[F00:P07]  Feb 16 16:40:23  kernel: xhci_hcd 0002:01:00.0: Slot 15
input ctx = 0x87ea9000 (dma)
[F00:P07]  Feb 16 16:40:23  kernel: xhci_hcd 0002:01:00.0: Set slot id
15 dcbaa entry 000000001958e382 to 0x8422f000
[F00:P07]  Feb 16 16:40:24  kernel: xhci_hcd 0002:01:00.0: Set root
hub portnum to 4
[F00:P07]  Feb 16 16:40:24  kernel: xhci_hcd 0002:01:00.0: Set fake
root hub portnum to 2
[F00:P07]  Feb 16 16:40:24  kernel: xhci_hcd 0002:01:00.0: udev->tt =
0000000000000000
[F00:P07]  Feb 16 16:40:24  kernel: xhci_hcd 0002:01:00.0: udev->ttport = 0x0
[F00:P06]  Feb 16 16:40:24  kernel: xhci_hcd 0002:01:00.0:
xhci_setup_device: xhci_queue_address_device
[F00:P07]  Feb 16 16:40:24  kernel: xhci_hcd 0002:01:00.0: // Ding dong!
[F00:P06]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: completion
cmd 11, time: 173580444, duration: 9596416 us
[F00:P04]  Feb 16 16:40:33  kernel: usb 3-2.1: Device not responding
to setup address.
[F00:P07]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: // Ding dong!
[F00:P06]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: completion
cmd 10, time: 173583035, duration: 2529 us
[F00:P06]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: completion
cmd 9, time: 173583129, duration: 83 us
[F00:P07]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: Slot 16
output ctx = 0x8422f000 (dma)
[F00:P07]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: Slot 16
input ctx = 0x87ea9000 (dma)
[F00:P07]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: Set slot id
16 dcbaa entry 00000000ae4b7693 to 0x8422f000
[F00:P03]  Feb 16 16:40:33  kernel: usb 3-2.1: device not accepting
address 13, error -71
[F00:P07]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: // Ding dong!
[F00:P06]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: completion
cmd 10, time: 173790654, duration: 2529 us
[F00:P07]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: Bad real port.
[F00:P06]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: completion
cmd 9, time: 173790749, duration: 86 us
[F00:P07]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: Slot 17
output ctx = 0x8422f000 (dma)
[F00:P07]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: Slot 17
input ctx = 0x87ea9000 (dma)
[F00:P07]  Feb 16 16:40:33  kernel: xhci_hcd 0002:01:00.0: Set slot id
17 dcbaa entry 00000000e45e3c7b to 0x8422f000
[F00:P07]  Feb 16 16:40:34  kernel: xhci_hcd 0002:01:00.0: // Ding dong!
[F00:P06]  Feb 16 16:40:34  kernel: xhci_hcd 0002:01:00.0: completion
cmd 10, time: 174478971, duration: 2528 us
[F00:P07]  Feb 16 16:40:34  kernel: xhci_hcd 0002:01:00.0: Bad real port.

Thanks for your assistance,

Seth

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-02-28 14:27 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-02-17 14:21 HC died Seth Bollinger
2023-02-21 16:26 ` Mathias Nyman
2023-02-23 15:31   ` Seth Bollinger
2023-02-23 15:48     ` Alan Stern
2023-02-23 16:05       ` Seth Bollinger
2023-02-24 12:24         ` Mathias Nyman
2023-02-24 13:16           ` Seth Bollinger
2023-02-27 13:07             ` Mathias Nyman
2023-02-28 14:27               ` Seth Bollinger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.