* [Bug 219532] Crash in RIP: 0010:xhci_handle_stopped_cmd_ring
2024-11-26 1:53 [Bug 219532] New: Crash in RIP: 0010:xhci_handle_stopped_cmd_ring bugzilla-daemon
@ 2024-11-26 1:55 ` bugzilla-daemon
2024-11-26 1:56 ` bugzilla-daemon
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: bugzilla-daemon @ 2024-11-26 1:55 UTC
To: linux-usb
https://bugzilla.kernel.org/show_bug.cgi?id=219532
--- Comment #1 from James.Dutton@gmail.com ---
Created attachment 307280
--> https://bugzilla.kernel.org/attachment.cgi?id=307280&action=edit
xhci-ring.c
The kernel source file where the crash happens.
xhci-ring.c:423
	/* ring command ring doorbell to restart the command ring */
	if ((xhci->cmd_ring->dequeue != xhci->cmd_ring->enqueue) &&
	    !(xhci->xhc_state & XHCI_STATE_DYING)) {
		xhci->current_cmd = cur_cmd;	/* <- Crashes here. NULL pointer dereference. */
		xhci_mod_cmd_timer(xhci);
		xhci_ring_cmd_db(xhci);
	}
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
* [Bug 219532] Crash in RIP: 0010:xhci_handle_stopped_cmd_ring
From: bugzilla-daemon @ 2024-11-26 1:56 UTC
To: linux-usb
https://bugzilla.kernel.org/show_bug.cgi?id=219532
--- Comment #2 from James.Dutton@gmail.com ---
Created attachment 307281
--> https://bugzilla.kernel.org/attachment.cgi?id=307281&action=edit
The compiled xhci-ring.o file
* [Bug 219532] Crash in RIP: 0010:xhci_handle_stopped_cmd_ring
From: bugzilla-daemon @ 2024-11-26 2:00 UTC
To: linux-usb
https://bugzilla.kernel.org/show_bug.cgi?id=219532
--- Comment #3 from James.Dutton@gmail.com ---
I think the xhci variable must be NULL on entry to
xhci_handle_stopped_cmd_ring().
One could put a NULL check at the top of the function, but I don't know what
one should do in that case.
It's a void function, so no error value can be returned.
* [Bug 219532] Crash in RIP: 0010:xhci_handle_stopped_cmd_ring
From: bugzilla-daemon @ 2024-11-26 2:05 UTC
To: linux-usb
https://bugzilla.kernel.org/show_bug.cgi?id=219532
--- Comment #4 from James.Dutton@gmail.com ---
c1:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 15b9 (prog-if 30 [XHCI])
	Subsystem: Framework Computer Inc. Device 0005
	Flags: bus master, fast devsel, latency 0, IRQ 45, IOMMU group 17
	Memory at 90200000 (64-bit, non-prefetchable) [size=1M]
	Capabilities: <access denied>
	Kernel driver in use: xhci_hcd
* [Bug 219532] Crash in RIP: 0010:xhci_handle_stopped_cmd_ring
From: bugzilla-daemon @ 2024-11-26 23:09 UTC
To: linux-usb
https://bugzilla.kernel.org/show_bug.cgi?id=219532
Michał Pecio (michal.pecio@gmail.com) changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |michal.pecio@gmail.com
--- Comment #5 from Michał Pecio (michal.pecio@gmail.com) ---
It looks like some device doesn't respond to address assignment after
connection. If you weren't connecting anything at the time, it's possible that
a device is buggy and had disconnected by itself a moment earlier, but the log
is too short to tell.
Not sure why it crashed. It looks like there were two attempts 6 seconds apart
and no crash on the first attempt.
It's unlikely that xhci was NULL near the end of the function. If it were,
there were several opportunities to crash earlier. The call to
xhci_mod_cmd_timer() in the next line dereferences xhci->current_cmd, which is
perhaps more suspicious.
* [Bug 219532] Crash in RIP: 0010:xhci_handle_stopped_cmd_ring
From: bugzilla-daemon @ 2024-11-28 0:19 UTC
To: linux-usb
https://bugzilla.kernel.org/show_bug.cgi?id=219532
--- Comment #6 from James.Dutton@gmail.com ---
It might be dealing with a buggy device. My question is how one should do error
recovery here when xhci is NULL but the function has no error return value, as
it's a void function.
* [Bug 219532] Crash in RIP: 0010:xhci_handle_stopped_cmd_ring
From: bugzilla-daemon @ 2024-12-01 22:06 UTC
To: linux-usb
https://bugzilla.kernel.org/show_bug.cgi?id=219532
--- Comment #7 from Michał Pecio (michal.pecio@gmail.com) ---
No sensible way to handle it and it should never happen. All we could do is
print an error and return immediately, but in any such case the xHCI driver is
likely already FUBAR anyway.
I *hope* that you are mistaken and your crash was caused by dereferencing
xhci->current_cmd in the next line, due to cur_cmd being NULL. This is not
supposed to happen either, because the check for (xhci->cmd_ring->dequeue !=
xhci->cmd_ring->enqueue) is there exactly to catch cases when no commands are
pending and cur_cmd is expected to be NULL.
But it doesn't work for one in 255 commands, namely when the aborted command
was the last one in its ring segment. Then enqueue points at the subsequent
link TRB, while dequeue is already in the next segment. Until recently, such a
command abort would have failed due to a different bug (and caused different
problems), but that other bug has just been fixed and it looks like we may
start seeing those NULL dereferences now.
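
To make the one-in-255 case concrete, here is a toy model of the two ring
pointers. It is only a sketch of the description above, not the kernel's data
structures: the toy_* names, the two-segment ring and the 256-slot segment
size (255 command slots plus one link TRB) are all assumptions made for
illustration. Enqueue is modeled as stopping on the link TRB until the next
command is queued, while dequeue follows the link as soon as it reaches it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout: 255 command slots followed by one link TRB. */
#define TRBS_PER_SEGMENT 256
#define LINK_TRB_INDEX (TRBS_PER_SEGMENT - 1)
#define NUM_SEGMENTS 2

struct ring_pos {
	unsigned int seg;	/* segment number */
	unsigned int idx;	/* TRB slot within the segment */
};

struct toy_ring {
	struct ring_pos enqueue;
	struct ring_pos dequeue;
	unsigned int pending;	/* commands queued but not yet finished */
};

/* Jump from a link TRB to the first slot of the next segment. */
static struct ring_pos follow_link(struct ring_pos p)
{
	p.seg = (p.seg + 1) % NUM_SEGMENTS;
	p.idx = 0;
	return p;
}

/* Queue one command. Enqueue only leaves a link TRB when it needs a slot. */
static void toy_queue_command(struct toy_ring *r)
{
	if (r->enqueue.idx == LINK_TRB_INDEX)
		r->enqueue = follow_link(r->enqueue);
	r->enqueue.idx++;
	r->pending++;
}

/* Finish (or abort) one command. Dequeue follows a link TRB immediately. */
static void toy_finish_command(struct toy_ring *r)
{
	assert(r->pending > 0);
	r->dequeue.idx++;
	if (r->dequeue.idx == LINK_TRB_INDEX)
		r->dequeue = follow_link(r->dequeue);
	r->pending--;
}

/* The pointer comparison used as a stand-in for "commands are pending". */
static bool toy_ring_looks_busy(const struct toy_ring *r)
{
	return r->dequeue.seg != r->enqueue.seg ||
	       r->dequeue.idx != r->enqueue.idx;
}
```

In this model, queueing and finishing 254 commands leaves both pointers on the
same slot, so the ring looks idle. Queueing and finishing one more leaves
enqueue on the link TRB of segment 0 and dequeue on slot 0 of segment 1: the
comparison reports a busy ring even though pending is 0.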
This patch should keep your system from crashing *if* this is the problem that
you are running into. The driver should print "cur_cmd bug detected, 0 fff" and
continue working normally. (Which means, keep printing more of those "setup
device timed out".)
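
As a rough illustration of the kind of guard discussed here, the sketch below
refuses to restart the command timer when no command is pending. It is not the
attached patch: the toy_* types and functions are hypothetical stand-ins, and
the timer helper is merely assumed to read a timeout field through
current_cmd, which is why a NULL cur_cmd would fault there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's command and host structures. */
struct toy_command {
	unsigned int timeout_ms;
};

struct toy_hcd {
	struct toy_command *current_cmd;
	unsigned int armed_timeout_ms;	/* what the fake timer was set to */
};

/* Stand-in for a timer helper that dereferences current_cmd. */
static void toy_mod_cmd_timer(struct toy_hcd *hcd)
{
	hcd->armed_timeout_ms = hcd->current_cmd->timeout_ms;
}

/*
 * Restart the command ring only if there really is a pending command.
 * Returns false (after logging) when cur_cmd is NULL, instead of letting
 * the timer helper dereference a NULL current_cmd.
 */
static bool toy_restart_cmd_ring(struct toy_hcd *hcd,
				 struct toy_command *cur_cmd)
{
	if (!cur_cmd) {
		fprintf(stderr, "toy: no pending command, timer not restarted\n");
		return false;
	}
	hcd->current_cmd = cur_cmd;
	toy_mod_cmd_timer(hcd);
	return true;
}
```

Calling toy_restart_cmd_ring() with a NULL cur_cmd leaves the host state
untouched and returns false; with a valid command it records the command and
arms the fake timer with that command's timeout.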
* [Bug 219532] Crash in RIP: 0010:xhci_handle_stopped_cmd_ring
From: bugzilla-daemon @ 2024-12-01 22:07 UTC
To: linux-usb
https://bugzilla.kernel.org/show_bug.cgi?id=219532
--- Comment #8 from Michał Pecio (michal.pecio@gmail.com) ---
Created attachment 307305
--> https://bugzilla.kernel.org/attachment.cgi?id=307305&action=edit
try to fix the bug and gather confirmation