From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: Michal Pecio <michal.pecio@gmail.com>,
Martin Alderson <martinalderson@gmail.com>
Cc: linux-usb@vger.kernel.org
Subject: Re: xhci_hcd: AMD Raphael/Granite Ridge USB 2.0 xHCI [1022:15b8] dies on resume from suspend
Date: Tue, 12 May 2026 17:01:37 +0300
Message-ID: <fc2d9862-6c46-4161-8fd5-68b9e6c2e8bb@linux.intel.com>
In-Reply-To: <20260512120334.4eef3d0b.michal.pecio@gmail.com>
On 5/12/26 13:03, Michal Pecio wrote:
> On Sun, 10 May 2026 17:29:26 +0100, Martin Alderson wrote:
>> 1. The timing is during suspend in every single failure I have logs
>> for. I went back through 7 weeks of persistent journals and pulled the
>> context around every "HC died" event. All 9 failures show the same
>> sequence:
>>
>> xhci_hcd 0000:0f:00.0: xHCI host not responding to stop endpoint command
>> xhci_hcd 0000:0f:00.0: xHCI host controller not responding, assume dead
>> xhci_hcd 0000:0f:00.0: HC died; cleaning up
>> PM: suspend devices took 5.5--6.1 seconds <-- elevated
>> amdgpu 0000:03:00.0: MODE1 reset
>> ACPI: PM: Preparing to enter system sleep state S3
>>
>> So it's reliably during suspend, before S3 entry, and the elevated
>> "suspend devices took" matches the 5s xHCI stop-endpoint timeout. A
>> clean suspend on the same boot takes ~0.46s.
>
> The S3 state probably doesn't matter, chances are that it would also
> happen with s2idle or hibernation.
>
> Could you enable dynamic debug before every suspend (or permanently
> on every boot) and collect a dmesg log of this happening again?
> And maybe also a snapshot of debugfs directory after resume but before
> unbinding xhci_hcd. These may contain clues what triggered it.

It's possible there is a race between queuing a command and suspend.

It looks like nothing prevents a new command from being queued while
suspend stops the host from running, which would cause commands to time out.
Suspend doesn't check whether there are pending commands, or whether the
command timer is running, either.
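
To make the suspected interleaving concrete, here is a minimal userspace
sketch. It is not kernel code and not the debug patch: struct toy_xhc and the
toy_* helpers are hypothetical names made up for illustration, with
hw_accessible standing in for "the controller is still running" and
pending_cmds for commands that were queued but have not completed.

```c
/* Userspace toy model of the suspected race. Not kernel code: every name
 * here is hypothetical and only mirrors the idea described above. */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_xhc {
	bool hw_accessible;	/* assumption: models "controller is running" */
	int pending_cmds;	/* assumption: models queued, uncompleted commands */
};

/* Queue one command; refuse once suspend has stopped the controller. */
static int toy_queue_command(struct toy_xhc *xhc)
{
	if (!xhc->hw_accessible) {
		fprintf(stderr, "can't queue command, controller not accessible\n");
		return -ESHUTDOWN;
	}
	xhc->pending_cmds++;
	return 0;
}

/* Stop the controller; return how many commands were still pending. */
static int toy_suspend(struct toy_xhc *xhc)
{
	int stranded = xhc->pending_cmds;

	if (stranded)
		fprintf(stderr, "suspending with %d pending command(s)\n", stranded);
	xhc->hw_accessible = false;
	return stranded;
}

/* One bad interleaving: a command is queued just before suspend stops the
 * controller and never completes; a second one arrives after the stop.
 * Returns the number of stranded commands, or -1 if the late command was
 * unexpectedly accepted. */
int toy_race_demo(void)
{
	struct toy_xhc xhc = { .hw_accessible = true, .pending_cmds = 0 };
	int stranded;

	toy_queue_command(&xhc);	/* queued, not yet completed */
	stranded = toy_suspend(&xhc);	/* controller stops underneath it */
	if (toy_queue_command(&xhc) != -ESHUTDOWN)
		return -1;		/* late command should be refused */
	return stranded;
}
```

In this model the first command is the one that would later time out, since
nothing completes it once the controller is stopped. The real driver would
also need locking around both checks, which the sketch leaves out.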

I wrote some debugging code; it can be found in my debug_hc_died_cmdring_race branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git debug_hc_died_cmdring_race
https://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git/log/?h=debug_hc_died_cmdring_race

If it prints

"Can't queue command, xHC not accessible (stopped?)"
or
"Suspending and stopping xHC with pending command(s)!!!"

then we have a queue_command - suspend race.

Code below for reference.

Mathias

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index e47e644b296e..50ce4a4a7fe3 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -4353,6 +4353,7 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd,
 			 u32 field3, u32 field4, bool command_must_succeed)
 {
 	int reserved_trbs = xhci->cmd_ring_reserved_trbs;
+	struct usb_hcd *hcd = xhci_to_hcd(xhci);
 	int ret;
 
 	if ((xhci->xhc_state & XHCI_STATE_DYING) ||
@@ -4362,6 +4363,14 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd,
 		return -ESHUTDOWN;
 	}
 
+	if (!HCD_HW_ACCESSIBLE(hcd)) {
+		xhci_err(xhci, "Can't queue command, xHC not accessible (stopped?)\n");
+		xhci_err(xhci, "called by %pS from %pS\n",
+			 __builtin_return_address(0),
+			 __builtin_return_address(1));
+		return -ESHUTDOWN;
+	}
+
 	if (!command_must_succeed)
 		reserved_trbs++;
 
diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index a54f5b57f205..04279fbbe1dd 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -949,6 +949,34 @@ static bool xhci_pending_portevent(struct xhci_hcd *xhci)
 	return false;
 }
 
+static void xhci_dump_ring(struct xhci_hcd *xhci, struct xhci_ring *ring)
+{
+	struct xhci_segment *seg;
+	union xhci_trb *trb;
+	dma_addr_t dma;
+	char str[XHCI_MSG_MAX];
+	int i, j;
+
+	seg = ring->first_seg;
+	dma = xhci_trb_virt_to_dma(ring->deq_seg, ring->dequeue);
+
+	xhci_err(xhci, "Dequeue: %pad\n", &dma);
+
+	for (i = 0; i < ring->num_segs; i++) {
+		for (j = 0; j < TRBS_PER_SEGMENT; j++) {
+			trb = &seg->trbs[j];
+			dma = seg->dma + j * sizeof(*trb);
+			xhci_err(xhci, "%pad: %s\n", &dma,
+				 xhci_decode_trb(str, XHCI_MSG_MAX,
+						 le32_to_cpu(trb->generic.field[0]),
+						 le32_to_cpu(trb->generic.field[1]),
+						 le32_to_cpu(trb->generic.field[2]),
+						 le32_to_cpu(trb->generic.field[3])));
+		}
+		seg = seg->next;
+	}
+}
+
 /*
  * Stop HC (not bus-specific)
  *
@@ -999,6 +1027,12 @@ int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup)
 	/* step 1: stop endpoint */
 	/* skipped assuming that port suspend has done */
 
+	/* Check if command ring is empty */
+	if (!list_empty(&xhci->cmd_list)) {
+		xhci_err(xhci, "Suspending and stopping xHC with pending command(s)!!!\n");
+		xhci_dump_ring(xhci, xhci->cmd_ring);
+	}
+
 	/* step 2: clear Run/Stop bit */
 	command = readl(&xhci->op_regs->command);
 	command &= ~CMD_RUN;