[Bug 221103] xhci_hcd: System lockup under CPU load during usbfs polling of USB devices on AMD platforms

All of lore.kernel.org
 help / color / mirror / Atom feed

From: bugzilla-daemon@kernel.org
To: linux-usb@vger.kernel.org
Subject: [Bug 221103] xhci_hcd: System lockup under CPU load during usbfs polling of USB devices on AMD platforms
Date: Tue, 24 Feb 2026 07:45:03 +0000	[thread overview]
Message-ID: <bug-221103-208809-YI3kLaShqa@https.bugzilla.kernel.org/> (raw)
In-Reply-To: <bug-221103-208809@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=221103

--- Comment #19 from Paul Alesius (paul@unnservice.com) ---
(In reply to Michał Pecio from comment #18)
> Could you enable dynamic debug and check if simply toggling power/control
> between 'on' and 'auto' produces the same xhci_suspend/xhci_resume messages?
> Would this be enough to hang the system
Enabling dynamic debug and changing power/control on/auto rapidly produces the
same suspend/resume messages on all devices. Changing control= on and auto
rapidly on 0000:7a:00.4 does not trigger the freeze.

> What's the state of power/control for those HCs which aren't causing
> problems? Are they also getting resumed and suspended under your test, but
> without crashing? That would be at least one optimistic result in this whole
> mess :)
About half of them are on and auto, those with control=auto by default do not
trigger the freeze (except the known-bad 7a:00.4, and I've not stressed the
others as much until arriving at the conclusion that it's 7a:00.4 triggering
the freeze). Here's their default values and notes on them:

 control=on 0000:0e:00.0 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0
root hub
 control=on 0000:0e:00.0 Bus 001 Device 002: ID 13d3:3588 IMC Networks
Wireless_Device (Internal)
 control=on 0000:0e:00.0 Bus 001 Device 003: ID 0b05:19af ASUSTek Computer,
Inc. AURA LED Controller (Internal)
 control=on 0000:0e:00.0 Bus 001 Device 004: ID 046d:c548 Logitech, Inc. Logi
Bolt Receiver (Plugged in)
 control=on 0000:0e:00.0 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0
root hub
 control=on 0000:10:00.0 Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0
root hub
 control=on 0000:10:00.0 Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0
root hub
The 78:00.0 have xhci_pci_suspend -110 errors during boot:
[   17.918387] xhci_hcd 0000:78:00.0: WARN: xHC CMD_RUN timeout
[   17.918508] xhci_hcd 0000:78:00.0: PM: suspend_common(): xhci_pci_suspend
returns -110
[   17.918586] xhci_hcd 0000:78:00.0: can't suspend (hcd_pci_runtime_suspend
returned -110)
 control=auto 0000:78:00.0 Bus 005 Device 001: ID 1d6b:0002 Linux Foundation
2.0 root hub
 control=auto 0000:78:00.0 Bus 006 Device 001: ID 1d6b:0003 Linux Foundation
3.0 root hub
 control=auto 0000:7a:00.3 Bus 007 Device 001: ID 1d6b:0002 Linux Foundation
2.0 root hub
 control=auto 0000:7a:00.3 Bus 008 Device 001: ID 1d6b:0003 Linux Foundation
3.0 root hub
 control=auto 0000:7a:00.4 Bus 009 Device 001: ID 1d6b:0002 Linux Foundation
2.0 root hub
This is the root hub that freeze during rapid polling, same PCI ID as the line
above that is unaffected:
 control=auto 0000:7a:00.4 Bus 010 Device 001: ID 1d6b:0003 Linux Foundation
3.0 root hub
 control=auto 0000:7b:00.0 Bus 011 Device 001: ID 1d6b:0002 Linux Foundation
2.0 root hub

I then enabled full dynamic debug + netconsole (printk=8):
 $ echo 'module xhci_hcd +p' | sudo tee /proc/dynamic_debug/control
 $ echo 'module usbcore +p' | sudo tee /proc/dynamic_debug/control
 $ echo 'module pci +p' | sudo tee /proc/dynamic_debug/control
 $ echo 8 | sudo tee /proc/sys/kernel/printk

Surprisingly, the system did not freeze for over 20 minutes with 3 instances
polling simultaneously and stress-ng --cpu 0. The moment I killed stress-ng
first by coincidence, the system froze immediately. Netconsole captured this up
until the lockup:
...
[ 1766.915244] xhci_hcd 0000:7a:00.4: PME# disabled
[ 1766.915262] xhci_hcd 0000:7a:00.4: enabling bus mastering
... (normal suspend/resume cycle) ...
[ 1767.170769] xhci_hcd 0000:7a:00.4: PME# disabled
[ 1767.170774] xhci_hcd 0000:7a:00.4: enabling bus mastering
[ 1767.181194] xhci_hcd 0000:7a:00.4: Controller not ready at resume -19
[ 1767.181209] xhci_hcd 0000:7a:00.4: PCI post-resume error -19!
[ 1767.181213] xhci_hcd 0000:7a:00.4: HC died; cleaning up
[ 1767.181222] xhci_hcd 0000:7a:00.4: hcd_pci_runtime_resume: -19
[ 1767.181232] hub 9-0:1.0: state 0 ports 2 chg 0000 evt 0000
[ 1767.181238] hub 10-0:1.0: state 0 ports 2 chg 0000 evt 0000

> There is another bug 221073 about some AMD HCs dying on resume from system
> sleep,
> may be related. So far nobody knows why it happens.
I don't know enough to say whether they are the same root cause, but both
involve an AMD xHC dying on resume, so they may be related.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

next prev parent reply	other threads:[~2026-02-24  7:45 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-18 14:52 [Bug 221103] New: xhci_hcd: System lockup under CPU load during rapid usbfs polling of SuperSpeed root hubs on AMD Ryzen platforms bugzilla-daemon
2026-02-20  7:30 ` [Bug 221103] xhci_hcd: System lockup under CPU load during usbfs polling of USB devices on AMD platforms bugzilla-daemon
2026-02-20  8:31 ` bugzilla-daemon
2026-02-20  9:17   ` Greg KH
2026-02-20  9:16 ` bugzilla-daemon
2026-02-20  9:17 ` bugzilla-daemon
2026-02-20  9:24 ` bugzilla-daemon
2026-02-20  9:26 ` bugzilla-daemon
2026-02-20  9:28 ` bugzilla-daemon
2026-02-20  9:40 ` bugzilla-daemon
2026-02-20 10:07 ` bugzilla-daemon
2026-02-20 10:17   ` Greg KH
2026-02-20 10:17 ` bugzilla-daemon
2026-02-20 10:21 ` bugzilla-daemon
2026-02-20 11:19 ` bugzilla-daemon
2026-02-20 14:07 ` bugzilla-daemon
2026-02-20 17:18 ` bugzilla-daemon
2026-02-21  1:12 ` bugzilla-daemon
2026-02-23 13:05 ` bugzilla-daemon
2026-02-23 17:52 ` bugzilla-daemon
2026-02-23 22:33 ` bugzilla-daemon
2026-02-24  7:45 ` bugzilla-daemon [this message]
2026-02-24  8:52 ` bugzilla-daemon
2026-02-24 10:19 ` bugzilla-daemon
2026-02-24 12:03 ` bugzilla-daemon
2026-02-24 12:21 ` bugzilla-daemon
2026-02-24 15:42 ` bugzilla-daemon
2026-03-08 17:56 ` bugzilla-daemon
2026-07-09 11:58 ` bugzilla-daemon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-221103-208809-YI3kLaShqa@https.bugzilla.kernel.org/ \
    --to=bugzilla-daemon@kernel.org \
    --cc=linux-usb@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.