From: "Tu Dinh" <ngoc-tu.dinh@vates.tech>
To: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>,
xen-devel <xen-devel@lists.xenproject.org>
Cc: "Andrew Cooper" <andrew.cooper3@citrix.com>,
"Stefano Stabellini" <sstabellini@kernel.org>
Subject: Re: PCI passthrough of XHCI on Framework AMD crashes the host
Date: Wed, 23 Jul 2025 12:55:54 +0000
Message-ID: <f2d125f2-febe-4e92-a7cf-5373b069cd1c@vates.tech>
In-Reply-To: <aIDXIqA4L7wcJH2T@mail-itl>
On 23/07/2025 14:35, Marek Marczykowski-Górecki wrote:
> Hi,
>
> There is yet another issue affecting Framework AMD... When I start a
> domU with XHCI controller attached (PCI passthrough), the whole host
> resets if there was a USB device plugged into it. I don't get any panic
> message (neither on XHCI console - which is connected to a different
> XHCI controller, nor on VGA), and the reboot reason register shows
> 0x08000800 ("an uncorrected error caused a data fabric sync flood
> event") according to [1].
>
> This is Framework AMD with AMD Ryzen 5 7640U.
>
> The crash itself happens quite early on domU startup - specifically when
> SeaBIOS tries to initialize XHCI. I tracked it down to the second
> readl() in xhci_controller_setup() [2]. Interestingly, it's specifically
> the second readl(), regardless of which of those comes first. I tried
> swapping their order, or even repeating read from the same register -
> always the second call triggers the crash. The first one succeeds and
> returns some value (for example 0x1200020 for HCCPARAMS).
>
> If I start the domU when no USB devices are connected, it doesn't crash.
>
> If I manually unbind the device from the dom0 driver (echo 0000:c3:00.4 >
> /sys/bus/pci/drivers/xhci_hcd/unbind), it doesn't crash. Note I have
> seize=1 in domU config, so the `xl pci-assignable-add` call is implicit.
>
> If the system doesn't crash (either by not having any USB devices
> connected initially, or by the manual unbind), the USB controller in
> domU works fine. I can later connect devices and they appear inside
> domU.
>
> This system has a couple of XHCI controllers, and the same behavior is
> observed on at least two of them.
>
> The controller works just fine when used in dom0.
>
> If I pass through another PCI device instead (tried wifi card and audio
> card), it doesn't crash.
>
> The value read from HCCPARAMS (BAR + 0x10) differs between good and bad case:
> - 0x01200020 when it crashes
> - 0x0110ffc5 when it works
>
> It's weird to have this many differences here, given most bits in this
> register are about device capabilities[3], not its runtime state...
>
> In this system my main debugging tool is the XHCI console. But I tried
> also without enabling XHCI console, and it still crashes, so it looks
> like it isn't caused by the XHCI console.
>
> I tried also disabling XHCI initialization in SeaBIOS, and then it
> proceeds to booting domU's kernel. But as soon as Linux gets into
> initializing that USB controller, it crashes the same way. So, it isn't
> just SeaBIOS doing something weird (or at least not just that).
>
> With PVH dom0, the behavior is a bit different:
> 1. Initially, the controller works fine in dom0.
> 2. When starting domU, instead of clean unbind this happens:
>
> [ 11.248760] xhci_hcd 0000:c3:00.4: Controller not ready at resume -19
> [ 11.248765] xhci_hcd 0000:c3:00.4: PCI post-resume error -19!
> [ 11.248767] xhci_hcd 0000:c3:00.4: HC died; cleaning up
> [ 11.249010] xhci_hcd 0000:c3:00.4: remove, state 4
> [ 11.249013] usb usb8: USB disconnect, device number 1
> [ 11.249437] xhci_hcd 0000:c3:00.4: USB bus 8 deregistered
> [ 11.249832] xhci_hcd 0000:c3:00.4: remove, state 4
> [ 11.249835] usb usb7: USB disconnect, device number 1
> [ 11.250074] xhci_hcd 0000:c3:00.4: Host halt failed, -19
> [ 11.250076] xhci_hcd 0000:c3:00.4: Host not accessible, reset failed.
> [ 11.250389] xhci_hcd 0000:c3:00.4: USB bus 7 deregistered
> [ 11.251011] pciback 0000:c3:00.4: xen_pciback: seizing device
> [ 11.335120] pciback 0000:c3:00.4: xen_pciback: vpci: assign to virtual slot 0
> [ 11.335544] pciback 0000:c3:00.4: registering for 1
>
> 3. Reading from BAR in domU (in SeaBIOS, and later Linux) returns
> 0xffffffff.
> 4. Does not crash the host.
>
> Any ideas?
>
> I don't have any other system with Zen4 to try on. The hw11 gitlab
> runner is Ryzen 7 7735HS, and it doesn't have this issue. It's also
> possible this is something related to Framework's firmware, but given all
> the observations above, I find it less likely.
>
> [1] https://docs.kernel.org/arch/x86/amd-debugging.html#random-reboot-issues
> [2] https://github.com/coreboot/seabios/blob/master/src/hw/usb-xhci.c#L553
> [3] https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/extensible-host-controler-interface-usb-xhci.pdf (page 385)
I had a similar problem with a Beelink mini PC with the Ryzen 5800U
after a recent Qubes upgrade.
If the USB controller is passed through to sys-usb then the system
simply resets without warning.
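For reference, the two HCCPARAMS values quoted above can be split into
fields to see exactly which bits differ. This is only a minimal sketch:
the field layout (capability flags in bits 0-11, MaxPSASize in bits
12-15, xECP in bits 16-31) is my reading of the HCCPARAMS1 definition in
the xHCI specification, and the function and key names are made up for
illustration.

```python
# Assumed HCCPARAMS1 layout (from the xHCI spec): one flag per bit 0..11,
# MaxPSASize in bits 12..15, xECP (extended capabilities pointer, in
# dwords) in bits 16..31. Names below are illustrative only.
HCCPARAMS1_FLAGS = [
    "AC64", "BNC", "CSZ", "PPC", "PIND", "LHRC",
    "LTC", "NSS", "PAE", "SPC", "SEC", "CFC",
]


def decode_hccparams1(value):
    """Split a raw 32-bit HCCPARAMS1 value into its assumed fields."""
    flags = [
        name
        for bit, name in enumerate(HCCPARAMS1_FLAGS)
        if value & (1 << bit)
    ]
    return {
        "flags": flags,
        "max_psa_size": (value >> 12) & 0xF,
        "xecp_dwords": (value >> 16) & 0xFFFF,
    }


if __name__ == "__main__":
    # The two values reported in the quoted message.
    for raw in (0x01200020, 0x0110FFC5):
        print(f"{raw:#010x}", decode_hccparams1(raw))
```

Under that assumed layout the two readings disagree in nearly every
field, including the extended capabilities pointer, which fits the
observation that this is more than a runtime state difference.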
Ngoc Tu Dinh | Vates XCP-ng Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech