From: Alex Williamson <alex@shazbot.org>
To: Manivannan Sadhasivam <mani@kernel.org>
Cc: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>,
ath11k@lists.infradead.org, ath12k@lists.infradead.org,
bhelgaas@google.com, jjohnson@kernel.org,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
linux-wireless@vger.kernel.org, mhi@lists.linux.dev,
alex@shazbot.org
Subject: Re: [PATCH v9] PCI: Add device-specific reset for Qualcomm devices
Date: Mon, 22 Jun 2026 16:08:22 -0600
Message-ID: <20260622160822.09350246@shazbot.org>
In-Reply-To: <4wmbans3ae5ayxqvs3wwn4hg3r3dcjuugmw2akoihvry35bq6k@k5lm6zjrp44l>

On Mon, 22 Jun 2026 18:22:39 +0200
Manivannan Sadhasivam <mani@kernel.org> wrote:
> On Thu, Jun 18, 2026 at 08:33:08AM +0200, Jose Ignacio Tornos Martinez wrote:
> > Hi Mani,
> >
> > Let me clarify the exact scenario and where the reset is necessary:
> >
> > * For the commented WiFi devices (WCN6855/WCN7850):
> >
> > Standard VFIO passthrough flow (this works fine):
> > 1. Unbind native driver (ath11k/ath12k/MHI)
> > 2. Bind vfio-pci driver
> > 3. Assign device to VM
> > 4. VM boots, loads its own driver → device works perfectly
> > 5. VM shuts down cleanly → device can be reassigned → works fine
> >
> > The problem occurs with unclean VM termination:
> > 1. VM crashes or is force-terminated
> > 2. VFIO tries to reset the device before reassignment
> > 3. Without a working PCI reset method, reset fails
> > 4. Device stuck in undefined state → cannot be reassigned to another VM
> >
> > Unbinding the driver again doesn't help because the device hardware
> > itself is in a bad state. From hypervisor:
> > $ lspci -vvv -s 0000:03:00.0
> > 03:00.0 Network controller: Qualcomm Technologies, Inc (rev ff) (prog-if ff)
> > !!! Unknown header type 7f
> > And a full host power-cycle is necessary to recover.
> >
>
> Can you try the global reset available in the WLAN device BAR space?
>
> WCN6855: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/tree/drivers/net/wireless/ath/ath11k/pci.c#n193
> WCN7850: https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/tree/drivers/net/wireless/ath/ath12k/pci.c#n182
>
> > * For the commented modem devices (SDX62/SDX65):
> >
> > Even worse because it fails during the first VM boot without proper reset
> > capability, standard VFIO passthrough flow:
> > 1. Unbind native driver (MHI)
> > 2. Bind vfio-pci driver
> > 3. Assign device to VM
> > 4. VM boots, loads its own driver and crashes:
> > [ 24.024165] mhi mhi0: Device failed to enter MHI Ready
> > [ 24.024168] mhi mhi0: MHI did not enter READY state
> >
> > Unbind/rebind attempts fail:
> > [ 352.643601] mhi mhi0: Requested to power ON
> > [ 352.643611] mhi mhi0: Power on setup success
> > [ 373.442954] mhi mhi0: Device failed to clear MHI Reset
> > [ 373.442970] mhi mhi0: MHI did not enter READY state
> > And requires a full host power cycle to recover,
> > even outside of VFIO scenarios.
> >
> > * MHI Host driver's remove callback may handle clean software state
> > teardown, but it doesn't provide a PCI reset capability that VFIO can
> > invoke. VFIO needs a reset method registered in the PCI reset hierarchy
> > (device_specific, pm, flr, bus, etc.). VFIO invokes this reset both during
> > initial device binding (before the VM starts) and when reassigning the
> > device between VMs - without a working reset method, the device cannot
> > reach a clean state for initialization.
> >
>
> Likewise, there is a SoC reset available in the modem BAR space. You can try it:
> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/tree/drivers/bus/mhi/host/main.c#n178
>
> If these work, then you can hook these in the device_specific reset callback.

These look promising as simple flows to implement in a device specific
reset: save command register, set memory enable, ioremap BAR space,
match read/write/delay sequences of reset function and caller, iounmap,
restore command.

Note the delay in this latter reset is in the caller. It's also
surprising that none of these implement a read to flush the posted
write that initiates the reset. I wonder if that contributes to the 2s
delay in the latter example.

Also it appears these reset the internal state, but not the PCI state,
which is fine for our purposes, and certainly more confidence inspiring
than the D3hot heuristics. Thanks,

Alex
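
[Editorial sketch, not part of the original message.] The flow outlined
above (save command, set memory enable, map the BAR, write the reset,
read back to flush, delay, unmap, restore command) can be modeled in a
small userspace mock. Everything prefixed fake_ or FAKE_ is hypothetical:
the register offset, trigger bit, and delay are placeholders and are not
taken from ath11k, ath12k, or MHI. Only PCI_COMMAND (0x04) and
PCI_COMMAND_MEMORY (0x2) are standard values. A kernel implementation
would presumably use the real config-space and MMIO accessors and a real
sleep instead of these stand-ins.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_COMMAND         0x04    /* standard config-space offset */
#define PCI_COMMAND_MEMORY  0x0002  /* standard memory-space enable bit */

/* Hypothetical register layout, for illustration only. */
#define FAKE_RESET_REG      0x10
#define FAKE_RESET_TRIGGER  0x1u
#define FAKE_RESET_DELAY_MS 10u

/* Userspace stand-in for a PCI function; not a kernel structure. */
struct fake_dev {
	uint16_t command;      /* mocked PCI command register */
	int mapped;            /* nonzero while the BAR is "ioremapped" */
	int dirty;             /* internal state the reset should clear */
	unsigned int slept_ms; /* delay requested, recorded instead of sleeping */
};

static uint16_t fake_cfg_read16(struct fake_dev *d, int off)
{
	return off == PCI_COMMAND ? d->command : 0xffff;
}

static void fake_cfg_write16(struct fake_dev *d, int off, uint16_t val)
{
	if (off == PCI_COMMAND)
		d->command = val;
}

/* MMIO only reaches the device while mapped and memory decode is enabled. */
static int fake_mmio_ok(const struct fake_dev *d)
{
	return d->mapped && (d->command & PCI_COMMAND_MEMORY);
}

static void fake_writel(struct fake_dev *d, uint32_t off, uint32_t val)
{
	if (fake_mmio_ok(d) && off == FAKE_RESET_REG && (val & FAKE_RESET_TRIGGER))
		d->dirty = 0;
}

static uint32_t fake_readl(struct fake_dev *d, uint32_t off)
{
	(void)off;
	return fake_mmio_ok(d) ? 0 : 0xffffffffu;
}

static int fake_device_specific_reset(struct fake_dev *d)
{
	/* Save the command register so it can be restored afterwards. */
	uint16_t saved = fake_cfg_read16(d, PCI_COMMAND);

	/* Enable memory decode, then "ioremap" the BAR. */
	fake_cfg_write16(d, PCI_COMMAND, saved | PCI_COMMAND_MEMORY);
	d->mapped = 1;

	/* Initiate the reset, then read back to flush the posted write. */
	fake_writel(d, FAKE_RESET_REG, FAKE_RESET_TRIGGER);
	(void)fake_readl(d, FAKE_RESET_REG);

	/* Stand-in for the settle delay after the reset write. */
	d->slept_ms += FAKE_RESET_DELAY_MS;

	/* "iounmap" and restore the original command register. */
	d->mapped = 0;
	fake_cfg_write16(d, PCI_COMMAND, saved);

	return d->dirty ? -1 : 0;
}
```

The mock drops MMIO writes while memory decode is disabled, which is why
the command register has to be set before the reset write and restored
only after the unmap. The read-back is placed right after the reset write
so the delay starts counting once the write has actually reached the
device.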