linux-usb.vger.kernel.org archive mirror
From: bugzilla-daemon@kernel.org
To: linux-usb@vger.kernel.org
Subject: [Bug 220069] [6.13.9] regression USB controller dies
Date: Sat, 03 May 2025 08:08:37 +0000
Message-ID: <bug-220069-208809-3mowzvjn49@https.bugzilla.kernel.org/>
In-Reply-To: <bug-220069-208809@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=220069

--- Comment #24 from Michał Pecio (michal.pecio@gmail.com) ---
Great. By the way, did you suspend, reboot or unbind xhci_hcd after taking the
working system debugfs dumps and before it died? Unfortunately, the "dead" dump
is missing information about connected devices, because they got dropped after
"HC died".


Below are the final commands executed by the HC. Their cycle bits are all set
("flags C") and there is no evidence of Stop Endpoint retries anywhere in the
whole command ring dump, so this is not any of the known or obviously suspected
problems, but something new and weird.

Stop Ring Command: slot 19 sp 0 ep 3 flags C
Set TR Dequeue Pointer Command: deq 00000000ffead3c0 stream 0 slot 19 ep 3 flags C
Stop Ring Command: slot 19 sp 1 ep 1 flags C
Stop Ring Command: slot 19 sp 0 ep 3 flags C
Set TR Dequeue Pointer Command: deq 00000000ffead3d0 stream 0 slot 19 ep 3 flags C
Stop Ring Command: slot 19 sp 1 ep 1 flags C
Stop Ring Command: slot 6 sp 0 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddbc1 stream 0 slot 6 ep 5 flags C
Stop Ring Command: slot 6 sp 0 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddbd1 stream 0 slot 6 ep 5 flags C
Stop Ring Command: slot 6 sp 0 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddbe1 stream 0 slot 6 ep 5 flags C
Stop Ring Command: slot 6 sp 0 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddbf1 stream 0 slot 6 ep 5 flags C
Stop Ring Command: slot 6 sp 0 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddc01 stream 0 slot 6 ep 5 flags C
Stop Ring Command: slot 6 sp 0 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddc11 stream 0 slot 6 ep 5 flags C
Stop Ring Command: slot 19 sp 0 ep 1 flags C
Reset Device Command: slot 16 flags C
Reset Device Command: slot 19 flags C
Reset Device Command: slot 19 flags C
Address Device Command: ctx 00000000fff42000 slot 19 flags b:C
Disable Slot Command: slot 19 flags C
Enable Slot Command: flags C
Address Device Command: ctx 00000000fff42000 slot 20 flags b:C
Stop Ring Command: slot 6 sp 0 ep 5 flags C
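
For anyone who wants to repeat the "flags C" check on their own dump, here is
a rough Python sketch. It assumes the decoded one-command-per-entry text format
shown above (possibly wrapped by the mailer); parse_commands() and SAMPLE are
names made up for this example, and SAMPLE is just the tail of the list above.

```python
import re
from collections import Counter

# Tail of the command list quoted above, including one mailer-wrapped entry.
SAMPLE = """\
Stop Ring Command: slot 6 sp 0 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddc11 stream 0 slot 6 ep 5 flags
C
Stop Ring Command: slot 19 sp 0 ep 1 flags C
Reset Device Command: slot 16 flags C
Reset Device Command: slot 19 flags C
Reset Device Command: slot 19 flags C
Address Device Command: ctx 00000000fff42000 slot 19 flags b:C
Disable Slot Command: slot 19 flags C
Enable Slot Command: flags C
Address Device Command: ctx 00000000fff42000 slot 20 flags b:C
Stop Ring Command: slot 6 sp 0 ep 5 flags C
"""


def parse_commands(dump):
    """Return one dict per command with its name, slot (or None) and flags."""
    joined = []
    for line in dump.splitlines():
        line = line.strip()
        if not line:
            continue
        if "Command:" in line or not joined:
            joined.append(line)          # start of a new command entry
        else:
            joined[-1] += " " + line     # continuation of a wrapped entry

    commands = []
    for entry in joined:
        name, _, rest = entry.partition(" Command:")
        slot = re.search(r"\bslot (\d+)", rest)
        flags = re.search(r"\bflags (\S+)", rest)
        commands.append({
            "name": name,
            "slot": int(slot.group(1)) if slot else None,
            # Flag tokens are kept exactly as printed, e.g. {"b", "C"}.
            "flags": set(flags.group(1).split(":")) if flags else set(),
        })
    return commands


commands = parse_commands(SAMPLE)
missing_cycle = [c for c in commands if "C" not in c["flags"]]
per_type = Counter(c["name"] for c in commands)

print("commands without cycle bit:", len(missing_cycle))
for name, count in per_type.most_common():
    print(f"{count:3d}  {name}")
```

On this excerpt nothing is reported as missing the cycle bit, which matches
the observation above. The per-type tally is only a convenience for spotting
unusual bursts in a longer dump.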

Initially, we see the familiar pattern of canceling a pending transfer on slot
19 ep 3 and stopping slot 19 ep 1 (the control endpoint) with "sp 1", which is
a hint that the device will be suspended. This is probably the 8-3 hub again.
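
In case the "ep" numbers are confusing: assuming the field printed in these
dumps is the xHCI endpoint ID (the Device Context Index), which is what "ep 1"
being the control endpoint suggests, it maps to a USB endpoint number and
direction as in the sketch below. decode_ep_id() is a helper written for this
example, not a kernel function.

```python
def decode_ep_id(ep_id: int) -> str:
    """Translate an xHCI endpoint ID (DCI) into USB endpoint number/direction.

    DCI 1 is the default control endpoint; otherwise DCI = 2 * number + 1
    for IN endpoints and 2 * number for OUT endpoints.
    """
    if not 1 <= ep_id <= 31:
        raise ValueError(f"endpoint ID out of range: {ep_id}")
    if ep_id == 1:
        return "ep0 control (bidirectional)"
    number = ep_id // 2
    direction = "IN" if ep_id % 2 else "OUT"
    return f"ep{number} {direction}"


for ep_id in (1, 3, 5):
    print(ep_id, "->", decode_ep_id(ep_id))
```

Under that assumption, "ep 3" is endpoint 1 IN and "ep 5" is endpoint 2 IN.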

Then there is some action on slot 6 ep 5, which I don't understand because
information about devices is not available. In the earlier debugfs dump from a
working system, slot 6 was the ASM107x hub, but endpoint id 5 was *not* enabled
on it, so that makes no sense.

Things begin to get unusual now: stop endpoint on slot 19 ep 1 with "sp 0",
then some devices are being reset. The last two commands fail to complete and
the HC hangs when the driver tries to abort them.

Looking at the event ring, the "unknown event type 4" actually points to the
Address Device command for slot 20, so maybe the HC completed this command (but
was already fubared enough to produce a corrupted event) and then got stuck for
real on the final command for slot 6.

But it was already fubared at this point, so something went wrong with those
resets or it was the slot 6 ep 5 churn which broke it. That looks like
repeatedly canceling a pending transfer before it completes and then
resubmitting a similar transfer, and IME such a pattern can break "ass media"
HCs if it repeats fast enough... (no timestamps here, unfortunately).
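
To illustrate the churn, here is a small Python sketch that pulls the slot 6
ep 5 dequeue pointers out of the command list and looks at how far each one
moves. It relies on the xHCI layout of the Set TR Dequeue Pointer command
(bit 0 of the pointer field is the Dequeue Cycle State, the low four bits are
not address bits) and on TRBs being 16 bytes; dequeue_pointers() and DUMP are
names made up for this example.

```python
import re

# The Set TR Dequeue Pointer commands for slot 6 ep 5 from the list above.
DUMP = """\
Set TR Dequeue Pointer Command: deq 00000000fffddbc1 stream 0 slot 6 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddbd1 stream 0 slot 6 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddbe1 stream 0 slot 6 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddbf1 stream 0 slot 6 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddc01 stream 0 slot 6 ep 5 flags C
Set TR Dequeue Pointer Command: deq 00000000fffddc11 stream 0 slot 6 ep 5 flags C
"""

SET_DEQ = re.compile(
    r"Set TR Dequeue Pointer Command: deq ([0-9a-f]+) stream \d+ "
    r"slot (\d+) ep (\d+)"
)
TRB_SIZE = 16  # bytes per TRB


def dequeue_pointers(dump, slot, ep):
    """Return (address, dcs) for each Set TR Dequeue Pointer on slot/ep."""
    result = []
    for match in SET_DEQ.finditer(dump):
        if int(match.group(2)) == slot and int(match.group(3)) == ep:
            raw = int(match.group(1), 16)
            address = raw & ~0xF   # drop DCS and the other low control bits
            dcs = raw & 1          # Dequeue Cycle State
            result.append((address, dcs))
    return result


pointers = dequeue_pointers(DUMP, slot=6, ep=5)
steps = [b[0] - a[0] for a, b in zip(pointers, pointers[1:])]

for address, dcs in pointers:
    print(f"deq {address:#x} dcs {dcs}")
print("steps in TRBs:", [step // TRB_SIZE for step in steps])
```

Every pointer is exactly one TRB past the previous one, which is what one would
expect if a single-TRB transfer is canceled and a similar one queued right
behind it each time. That reading is my interpretation of the numbers, not
something the dump states directly.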


Not entirely sure what to think about it yet, I will take a closer look at the
whole event ring later.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
