linux-usb.vger.kernel.org archive mirror
From: bugzilla-daemon@kernel.org
To: linux-usb@vger.kernel.org
Subject: [Bug 220069] [6.13.9] regression USB controller dies
Date: Thu, 01 May 2025 09:48:15 +0000
Message-ID: <bug-220069-208809-5YDVZMFW4Q@https.bugzilla.kernel.org/>
In-Reply-To: <bug-220069-208809@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=220069

--- Comment #14 from Claudio Wunder (cwunder@gnome.org) ---
> "xHCI host not responding to stop endpoint command"

The closest match to "xHCI host not responding to stop endpoint command" in
the logs I still have in journald is "Apr 21 11:23:10.258699 angel-thesis
kernel: xhci_hcd 0000:6a:00.0: xHCI host controller not responding, assume
dead"

Here are some more logs:

```
Apr 21 03:14:59 angel-thesis kernel: xhci_hcd 0000:6a:00.0: Event TRB for slot
4 ep 0 with no TDs queued
Apr 21 04:20:16 angel-thesis kernel: xhci_hcd 0000:6a:00.0: ERROR unknown event
type 4
Apr 21 04:20:16 angel-thesis kernel: xhci_hcd 0000:6a:00.0: Abort failed to
stop command ring: -110
Apr 21 04:20:16 angel-thesis kernel: xhci_hcd 0000:6a:00.0: xHCI host
controller not responding, assume dead
Apr 21 04:20:16 angel-thesis kernel: xhci_hcd 0000:6a:00.0: HC died; cleaning
up
Apr 21 04:20:16 angel-thesis kernel: xhci_hcd 0000:6a:00.0: Timeout while
waiting for setup device command
```

> if you also saw "ERROR unknown event type <some number>" a few seconds before
> "HC died". Do you still have those logs by any chance?

Yes!

```
Apr 26 23:57:40 angel-thesis kernel: xhci_hcd 0000:6b:00.0: Event TRB for slot
4 ep 0 with no TDs queued
Apr 26 23:57:40 angel-thesis kernel: usb 8-3: Device not responding to setup
address.
Apr 26 23:57:55 angel-thesis kernel: xhci_hcd 0000:6b:00.0: ERROR unknown event
type 4
Apr 26 23:57:55 angel-thesis kernel: xhci_hcd 0000:6b:00.0: Abort failed to
stop command ring: -110
Apr 26 23:57:55 angel-thesis kernel: xhci_hcd 0000:6b:00.0: xHCI host
controller not responding, assume dead
Apr 26 23:57:55 angel-thesis kernel: xhci_hcd 0000:6b:00.0: HC died; cleaning
up
Apr 26 23:57:55 angel-thesis kernel: xhci_hcd 0000:6b:00.0: Timeout while
waiting for setup device command
Apr 26 23:57:55 angel-thesis kernel: usb 7-2: USB disconnect, device number 2
Apr 26 23:57:55 angel-thesis kernel: usb 7-2.3: USB disconnect, device number 5
Apr 26 23:57:55 angel-thesis kernel: usb 7-2.3.1: USB disconnect, device number
10
Apr 26 23:57:55 angel-thesis kernel: usb 7-2.4: USB disconnect, device number 8
Apr 26 23:57:55 angel-thesis kernel: usb 7-2.5: USB disconnect, device number
13
Apr 26 23:57:55 angel-thesis kernel: usb 7-3: USB disconnect, device number 3
Apr 26 23:57:55 angel-thesis kernel: usb 7-3.1: USB disconnect, device number 7
Apr 26 23:57:55 angel-thesis kernel: usb 7-3.3: USB disconnect, device number
12
Apr 26 23:57:55 angel-thesis kernel: usb 7-3.4: USB disconnect, device number
14
Apr 26 23:57:55 angel-thesis kernel: usb 7-3.4.2: USB disconnect, device number
16
Apr 26 23:57:55 angel-thesis kernel: usb 7-3.5: USB disconnect, device number
15
Apr 26 23:57:55 angel-thesis kernel: usb 7-5: USB disconnect, device number 4
Apr 26 23:57:55 angel-thesis kernel: usb 7-5.2: USB disconnect, device number 9
Apr 26 23:57:55 angel-thesis kernel: usb 7-7: USB disconnect, device number 6
Apr 26 23:57:55 angel-thesis kernel: usb 7-11: USB disconnect, device number 11
Apr 26 23:57:55 angel-thesis kernel: usb 8-3: device not accepting address 3,
error -62
Apr 26 23:57:55 angel-thesis kernel: usb 8-3: USB disconnect, device number 3
Apr 26 23:57:55 angel-thesis kernel: usb 8-3.4: USB disconnect, device number 5
Apr 26 23:57:55 angel-thesis kernel: usb usb8-port3: couldn't allocate
usb_device
Apr 26 23:57:55 angel-thesis kernel: usb 8-2: USB disconnect, device number 2
Apr 26 23:57:55 angel-thesis kernel: usb 8-5: USB disconnect, device number 4
```

```
Apr 28 18:54:12 angel-thesis kernel: xhci_hcd 0000:6a:00.0: Event TRB for slot
18 ep 0 with no TDs queued
Apr 28 18:54:12 angel-thesis kernel: usb 8-3: Device not responding to setup
address.
Apr 28 18:54:28 angel-thesis kernel: xhci_hcd 0000:6a:00.0: ERROR unknown event
type 4
Apr 28 18:54:28 angel-thesis kernel: xhci_hcd 0000:6a:00.0: Abort failed to
stop command ring: -110
Apr 28 18:54:28 angel-thesis kernel: xhci_hcd 0000:6a:00.0: xHCI host
controller not responding, assume dead
Apr 28 18:54:28 angel-thesis kernel: xhci_hcd 0000:6a:00.0: HC died; cleaning
up
Apr 28 18:54:28 angel-thesis kernel: xhci_hcd 0000:6a:00.0: Timeout while
waiting for setup device command
Apr 28 18:54:28 angel-thesis kernel: usb 7-2: USB disconnect, device number 2
Apr 28 18:54:28 angel-thesis kernel: usb 7-2.3: USB disconnect, device number 5
Apr 28 18:54:28 angel-thesis kernel: usb 7-2.3.1: USB disconnect, device number
10
Apr 28 18:54:28 angel-thesis kernel: usb 7-2.3.2: USB disconnect, device number
13
Apr 28 18:54:28 angel-thesis kernel: usb 7-2.5: USB disconnect, device number 8
Apr 28 18:54:28 angel-thesis kernel: usb 7-3: USB disconnect, device number 3
Apr 28 18:54:28 angel-thesis kernel: usb 7-3.1: USB disconnect, device number 7
Apr 28 18:54:28 angel-thesis kernel: usb 8-3: device not accepting address 3,
error -62
Apr 28 18:54:28 angel-thesis kernel: usb 8-3: USB disconnect, device number 3
Apr 28 18:54:28 angel-thesis kernel: usb 8-3.4: USB disconnect, device number 5
Apr 28 18:54:28 angel-thesis kernel: usb usb8-port3: couldn't allocate
usb_device
Apr 28 18:54:28 angel-thesis kernel: usb 8-2: USB disconnect, device number 2
Apr 28 18:54:28 angel-thesis kernel: usb 8-5: USB disconnect, device number 4
Apr 28 18:54:28 angel-thesis kernel: usb 7-3.3: USB disconnect, device number
12
Apr 28 18:54:28 angel-thesis kernel: usb 7-3.4: USB disconnect, device number
14
Apr 28 18:54:28 angel-thesis kernel: usb 7-3.5: USB disconnect, device number
15
Apr 28 18:54:28 angel-thesis kernel: usb 7-5: USB disconnect, device number 4
Apr 28 18:54:28 angel-thesis kernel: usb 7-5.2: USB disconnect, device number 9
Apr 28 18:54:28 angel-thesis kernel: usb 7-7: USB disconnect, device number 6
Apr 28 18:54:28 angel-thesis kernel: usb 7-11: USB disconnect, device number 11
```

And indeed, it happens a few seconds afterwards.
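
To check that timing across many incidents rather than by eye, a minimal Python sketch follows. It assumes unwrapped, syslog-style lines such as those printed by `journalctl -k`, and since those timestamps carry no year, the year is passed in as an assumption. The function names are made up for illustration, and `%b` parsing assumes an English locale.

```python
import re
from datetime import datetime

# Matches e.g. "Apr 26 23:57:55 angel-thesis kernel: <message>"
# (an optional fractional-seconds part is tolerated and ignored).
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2})(?:\.\d+)? \S+ kernel: (?P<msg>.*)$"
)


def parse_journal_lines(lines, year=2025):
    """Return (timestamp, message) pairs for kernel lines; others are skipped.

    The year is an assumption: syslog-style timestamps do not include one.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((ts, m["msg"]))
    return events


def gaps_before_death(events, needle):
    """For each "HC died" message, seconds elapsed since the most recent
    earlier message containing `needle`."""
    gaps = []
    last_seen = None
    for ts, msg in events:
        if needle in msg:
            last_seen = ts
        elif "HC died" in msg and last_seen is not None:
            gaps.append((ts - last_seen).total_seconds())
            last_seen = None
    return gaps


# Three lines taken from the Apr 26 log above, unwrapped.
SAMPLE = [
    "Apr 26 23:57:40 angel-thesis kernel: usb 8-3: Device not responding to setup address.",
    "Apr 26 23:57:55 angel-thesis kernel: xhci_hcd 0000:6b:00.0: ERROR unknown event type 4",
    "Apr 26 23:57:55 angel-thesis kernel: xhci_hcd 0000:6b:00.0: HC died; cleaning up",
]

if __name__ == "__main__":
    events = parse_journal_lines(SAMPLE)
    for needle in ("Device not responding to setup address",
                   "ERROR unknown event type"):
        print(needle, "->", gaps_before_death(events, needle))
```

With only second-level timestamps, messages logged within the same second show a gap of zero, so short-precision output (`journalctl -k -o short-precise`) would give a finer picture.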

> So if you ever seem command abort timeout, either the abort code is buggy
> (and it looks like no one touched that part in ages) or the chip is buggy in
> one way or another.

That's interesting.

> These are present in all 6.12 and higher releases from this year, so the only
> supported kernels without them are old LTS series. Not sure if you have means
> of testing those for a few weeks on the same HW, userspace and workload?

I'll wait for the issue to happen again, so I can at least upload the debugfs.
Then I'll attempt to switch to an older kernel version (6.12.X) if needed.
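
A rough Python sketch of grabbing that debugfs data in one go is below. It assumes debugfs is mounted at `/sys/kernel/debug`, that the xHCI entries sit under `usb/xhci/<PCI address>` there, and that the script runs as root. `snapshot_tree` is a made-up helper name. Files are read and rewritten one by one because debugfs entries report a size of zero, which can confuse archive tools.

```python
import os
import sys


def snapshot_tree(src, dst):
    """Copy every readable file under `src` into the same layout under `dst`.

    Returns the number of files copied. Unreadable entries are skipped,
    since some debugfs files refuse reads.
    """
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        out_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(out_dir, exist_ok=True)
        for name in filenames:
            try:
                with open(os.path.join(dirpath, name), "rb") as f:
                    data = f.read()
            except OSError:
                continue  # skip entries that cannot be read
            with open(os.path.join(out_dir, name), "wb") as f:
                f.write(data)
            copied += 1
    return copied


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example (paths are assumptions):
    #   python3 snapshot.py /sys/kernel/debug/usb/xhci/0000:6a:00.0 ./xhci-6a
    print(snapshot_tree(sys.argv[1], sys.argv[2]), "files copied")
```

Whether the entries are still present once the controller has been declared dead is not something this sketch can guarantee, so taking a snapshot periodically beforehand may be the safer option.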

> I could also suggest some stress tests which exercise this code (and the USB
> controller). I found webcams and USB serial dongles to be particularly
> suitable, do you have some of such stuff at hand?

You mean a synthetic load, like Prime95? As for the dongles and connected USB
devices, here's a screenshot of the USB device tree:
https://gist.github.com/ovflowd/0b0aa5c748683eca33909dc3ed7c66f7#file-screenshot_20250501_113016-png

But pretty much:

- There's a RodeCaster Duo connected to one of the rear USB ports.
  - Note that it has two (2) USB-out ports, to connect to two devices.
- There's an Anker KVM switch (model number: A83K3) connected to another USB
  port, with a keyboard (Wooting 60HE+) and a Logitech Bolt dongle (wireless
  mouse dongle) connected to it.
  - The 2nd RodeCaster Duo USB port is also connected there.
- I'm using a monitor with a USB-B hub, to which a webcam (Insta 360 Link 2C)
  is connected. The hub is supposedly USB 3.2, per the monitor settings, but
  Linux recognises it as USB 2.0, possibly because the link is negotiated at
  2.0 speeds, whether the webcam is idle or not.

To be honest, none of these devices is bandwidth-hungry; even the webcam is
capped at 2K but always runs at 1080p/i.

There are also a bunch of integrated peripherals appearing there, such as
ASMedia's ASM107x controller (whatever that is; it seems to be shared between
two xHCI controllers), the Bluetooth controller, the LED controller and the AIO
pump controller.

It is really hard to say whether any of these devices is somehow crashing the
xHCI controller, and I believe only one specific controller is crashing. For
example, audio on my RodeCaster Duo and Bluetooth keep working when the crash
happens (not sure if this is important info), but all other devices (like my
mouse and keyboard) stop working. I have already tried plugging them into
different front ports, but not into rear ports.

And per the logs, it is exactly the controller that all my peripherals are on,
apart from the back port of my RodeCaster Duo:

```
Manufacturer: Linux 6.13.9-103.bazzite.fc42.x86_64 xhci-hcd
Serial #: 0000:6a:00.0
```

(This one has all my devices, mouse, etc., except for Bluetooth and the
1st RodeCaster Duo port.)

```
Manufacturer: Linux 6.13.9-103.bazzite.fc42.x86_64 xhci-hcd
Serial #: 0000:68:00.0
```

And since all integrated mobo peripherals are on the former one, I'm assuming
maybe it could be related to some integrated hardware, as you mentioned before?

It's really hard to know now without the logs, so I'll stop speculating.

> A simpler thing is to try different USB ports (rear or front panel) and see
> if any are connected to different (probably in-CPU) controllers.

Yeah, that's what I described above. 
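
To see the port-to-controller mapping without reading the device tree by hand, here is a small Python sketch. It assumes the usual sysfs layout, where each entry in `/sys/bus/usb/devices` resolves to a path that runs through the PCI address of its host controller (for example `.../0000:6a:00.0/usb7/7-2`). The function names are made up for illustration.

```python
import os
import re
from collections import defaultdict

# PCI address in domain:bus:device.function form, e.g. 0000:6a:00.0
PCI_BDF = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(real_path):
    """Return the last PCI address found among the path components,
    which is the closest PCI ancestor of the USB device, or None."""
    parts = [p for p in real_path.split("/") if PCI_BDF.fullmatch(p)]
    return parts[-1] if parts else None


def group_by_controller(root="/sys/bus/usb/devices"):
    """Map each controller's PCI address to a list of (bus-port, product)."""
    groups = defaultdict(list)
    if not os.path.isdir(root):
        return groups
    for name in sorted(os.listdir(root)):
        if ":" in name:
            continue  # skip interface entries such as "7-2:1.0"
        real = os.path.realpath(os.path.join(root, name))
        product = ""
        try:
            with open(os.path.join(real, "product")) as f:
                product = f.read().strip()
        except OSError:
            pass  # not every entry exposes a product string
        groups[controller_of(real)].append((name, product))
    return groups


if __name__ == "__main__":
    for controller, devices in group_by_controller().items():
        print(controller)
        for name, product in devices:
            print(f"  {name:12} {product}")
```

Running it before and after moving a device to another port would show whether the device ended up behind a different controller.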

> Your problem seems to be HW specific, because others generally stopped
> complaining after 6.13.7. I have heard about one more case of "Abort failed
> to stop command ring: -110" and suggested filing a bug here, but the reporter
> never did.

I worry I am wasting too much of your time, tbh. Honestly, I have no idea
what's going on; I'd just really like a solution for this, and to contribute as
much as I can by reporting a bug that may or may not be affecting other users.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
