public inbox for linux-usb@vger.kernel.org
From: bugzilla-daemon@kernel.org
To: linux-usb@vger.kernel.org
Subject: [Bug 221073] xHCI host controller dies on resume from s2idle on AMD Strix Halo [1022:1587]
Date: Wed, 11 Mar 2026 22:09:12 +0000
Message-ID: <bug-221073-208809-5P3JKioI2T@https.bugzilla.kernel.org/>
In-Reply-To: <bug-221073-208809@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=221073

--- Comment #35 from Alexander F (superveridical@gmail.com) ---
Created attachment 309621
  --> https://bugzilla.kernel.org/attachment.cgi?id=309621&action=edit
Z13 klogs with reg dump and MSI-only patches

Sorry for the delay.

I wanted to be succinct, but the circumstances make it impossible. My main OS
uses a zfs root, which taints the kernel, so I've been reporting from a copy of
the system modified to run from a 32 GB USB stick to avoid the taint (and to
isolate all the testing modifications). For all prior reports I used the
6.19.3 kernel and 20260110 firmware. To make my reports more relevant I decided
to update to the latest stable kernel, 6.19.6, and the 20260221 firmware. But
that caused an unforeseen issue, with the system no longer able to reach the
proper sleep state:

    Mar 10 19:14:57 rescue-flow kernel: amd_pmc AMDI000B:00: Last suspend didn't reach deepest state

I isolated the issue to the firmware upgrade, downgraded the firmware to
20260110 version, and it became like it was before. I don't know whether it's a
bug, firmware-kernel version compatibility discrepancy, potential brokenness of
my device manifesting itself, or something else, but I have to mention that I
used `make localmodconfig` from a 6.19.3 system in order to avoid long build
times (due to building all the modules) on the slow usb stick, and on the
laptop in general. 

What's weird/interesting is that when I tried to make sure that the issue is
reproducible after the updates and encountered "didn't reach deepest state", I
still managed to trigger the "HC died" issue. I did it 2 times, and both times
it took around 30 tries, which is much less frequent than usual, when I need
3-7 tries. (klog-deepest-state file)

So I returned to a manually built vanilla-sources-6.19.6 kernel with the
20260110 firmware and patched the kernel with the register dump patch.
Interestingly, the issue was a little harder to trigger than before, but I
didn't do enough runs to say definitively. The files are klog-pecio-patch and
klog-pecio-patch2.

I then applied the patch that makes the HC use the MSI interrupt (had to
manually erase it since the patch wasn't working with that version of the
kernel). I provided lspci output showing the effect. I was not able to
reproduce the issue with that patch. I had to automate the suspend/resume
cycles with:

    while true; do sleep 5; rtcwake -m freeze -s 7; if dmesg | grep -q "HC died"; then break; fi; done

I did about 70 cycles. I'll do at least 200 later to confirm.
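
A bounded variant of the loop above could look like the following sketch. It
is not the script that was actually run; the function names `hc_died` and
`run_cycles` and the argument layout are assumptions made for illustration.
The suspend and log commands are passed in as strings so the loop can stop
after a fixed number of cycles and report on which cycle "HC died" first
appeared.

```shell
#!/bin/sh
# Sketch only: hc_died and run_cycles are hypothetical helper names.

# Return 0 if the log text on stdin contains the failure message.
hc_died() {
    grep -q "HC died"
}

# run_cycles MAX SUSPEND_CMD LOG_CMD
# Runs SUSPEND_CMD up to MAX times and checks LOG_CMD output after each run.
# Returns 1 (and prints the cycle number) on the first "HC died",
# 2 if SUSPEND_CMD itself fails, 0 if all cycles complete cleanly.
run_cycles() {
    max=$1
    suspend_cmd=$2
    log_cmd=$3
    i=1
    while [ "$i" -le "$max" ]; do
        sh -c "$suspend_cmd" || return 2
        if sh -c "$log_cmd" | hc_died; then
            echo "HC died on cycle $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max cycles"
    return 0
}
```

With the commands from this report, a 200-cycle run would be
`run_cycles 200 'sleep 5; rtcwake -m freeze -s 7' dmesg` (as root). Like the
original one-liner, it stops immediately if the kernel log already contains
"HC died" from an earlier failure.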

Since there are power issues involved, I also provided 3 reports from the
amd-s2idle tool. Two of them are from the newer and the older firmware on the
live USB, and one is from the main zfs system with the 6.18.10 kernel, which I
included to show the following log entries in the 46th cycle:

ACPI: \_SB_.PCI0.GPP3: LPI: Constraint not met; min power state:D1 current power state:D0
ACPI: \_SB_.PCI0.GPP6: LPI: Constraint not met; min power state:D3hot current power state:D0
ACPI: \_SB_.PCI0.GP10: LPI: Device not power manageable
ACPI: \_SB_.PCI0.GPP0.SWUS: LPI: Constraint not met; min power state:D3hot current power state:D0
ACPI: \_SB_.PCI0.GPP1.SWUS: LPI: Constraint not met; min power state:D3hot current power state:D0
ACPI: \_SB_.PCI0.GPP5.WLAN: LPI: Device not power manageable

which for some reason (missing modules due to localmodconfig, misconfiguration,
the newer, vanilla kernel, I don't know) I wasn't able to make the tool produce
on the live USB, in either the test or the report mode. The output above was
produced for what I think is a regular working suspend, and I think also
without any potentially broken state from "HC died". I don't remember whether
it was with the 0x40 quirk or not. I'm not sure the sleep discharge rate is
nominal; it seems a bit high. I can do more digging if any of that is
important.
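
For comparing such entries across runs, the unmet constraints can be reduced
to one short line per device. The sketch below is only an illustration: the
function name `lpi_unmet` is made up, and it assumes each entry sits on a
single line, as in the raw kernel log (the mail wraps them).

```shell
#!/bin/sh
# Sketch only: lpi_unmet is a hypothetical helper name.
# Reads kernel log text on stdin and prints "device min-state current-state"
# for every "LPI: Constraint not met" entry; other lines are dropped.
lpi_unmet() {
    sed -n 's/.*ACPI: \(.*\): LPI: Constraint not met; min power state:\([^ ]*\) current power state:\([^ ]*\).*/\1 \2 \3/p'
}
```

Typical use would be `dmesg | lpi_unmet`. "Device not power manageable"
entries are intentionally skipped, since they carry no state pair.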

> There are some error flags set on DevSta

These flags only appear after the "HC died" event occurs. (That event also adds
a (warning) taint.) I verified that by running
`lspci -vvv | grep DevSta: | grep +` before/after every resume, and NonFatalErr
on all c4:00 devices flips only after the event.
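
The grep pipeline above drops the device address and also matches non-error
bits such as AuxPwr+. A slightly more specific check is sketched below; the
function name `devsta_errors` is an assumption, and the script relies on the
usual `lspci -vvv` layout (device header at column 0, indented `DevSta:` line
listing CorrErr, NonFatalErr, FatalErr and UnsupReq).

```shell
#!/bin/sh
# Sketch only: devsta_errors is a hypothetical helper name.
# Reads `lspci -vvv` output on stdin and prints, for each device with at
# least one DevSta error bit set, the device address followed by those bits.
devsta_errors() {
    awk '
        /^[0-9a-f]/ { dev = $1 }   # device header line, e.g. "c4:00.4 USB controller: ..."
        /DevSta:/ {
            flags = ""
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(CorrErr|NonFatalErr|FatalErr|UnsupReq)\+$/)
                    flags = flags " " $i
            if (flags != "") print dev flags
        }
    '
}
```

Saving `lspci -vvv | devsta_errors` to a file before and after each resume and
running `diff` on the two files would show exactly which devices changed.
lspci needs root to read the capability registers.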

Files:
klog-deepest-state -- kernel log of an attempt to trigger the issue with the
  newer firmware; I didn't enable debug output for that
lspci-right-after-boot -- lspci for the older firmware right after boot
lspci-0221-firmware-after-boot -- lspci for the newer firmware
klog-pecio-patch, klog-pecio-patch2 -- debug output with the register dump
  patch; use either of the two
lspci-msi-patched-right-after-boot -- shows that MSI, not MSI-X, interrupts are
  enabled
klog-msi-patched -- kernel log with the two patches. I had to trim it, since
  the issue wasn't triggered.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

