Linux wireless drivers development
From: Bjorn Helgaas <helgaas@kernel.org>
To: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>,
	Alex Williamson <alex@shazbot.org>
Cc: bhelgaas@google.com, jjohnson@kernel.org, mani@kernel.org,
	linux-pci@vger.kernel.org, linux-wireless@vger.kernel.org,
	ath11k@lists.infradead.org, ath12k@lists.infradead.org,
	mhi@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9] PCI: Add device-specific reset for Qualcomm devices
Date: Fri, 12 Jun 2026 10:17:49 -0500
Message-ID: <20260612151749.GA603817@bhelgaas>
In-Reply-To: <20260612142638.1243895-1-jtornosm@redhat.com>

[+to: Alex, VFIO resets]

On Fri, Jun 12, 2026 at 04:26:38PM +0200, Jose Ignacio Tornos Martinez wrote:
> Some Qualcomm PCIe devices (WCN6855/WCN7850 WiFi cards, SDX62/SDX65 modems)
> lack working reset methods for VFIO passthrough scenarios. These devices
> have no FLR capability, advertise NoSoftRst+ (blocking PM reset), and have
> broken bus reset.

I guess "bus reset" here refers to Secondary Bus Reset being asserted
by the bridge upstream from these devices?  Seems a bit surprising if
that doesn't work.  Or is it just that we can't use SBR because there
are multiple devices below that bridge?

> The problem manifests in VFIO passthrough scenarios:
> 
> - WCN6855 WiFi card (17cb:1103): Normal VM operation works fine, including
>   clean shutdown/reboot. However, when the VM terminates uncleanly
>   (crash, force-off), VFIO attempts to reset the device before it can
>   be assigned to another VM. Without a working reset method, the device
>   remains in an undefined state, preventing reuse.

I don't know enough about VFIO, but I sort of expected that VFIO would
reset devices between reassignment regardless of how a VM terminates.
I guess that's not true?

> - WCN7850 WiFi card (17cb:1107): Same behavior as WCN6855.
> 
> - SDX62/SDX65 5G modems (17cb:0308): Never successfully initialize even
>   on first VM assignment without proper reset capability.
> 
> Add device-specific reset entries for these Qualcomm devices using D3hot
> power cycling. Testing shows that despite advertising NoSoftRst+, D3hot
> transition provides sufficient reset for VFIO reuse, particularly after
> unexpected VM termination. While not a complete reset (BARs preserved),
> it provides the only viable reset mechanism for these devices.

Since the device claims to preserve internal state across D3hot->D0
(and it sounds like at least BARs *are* preserved), is this a
potential leak of state between VMs?  To play devil's advocate, how do
we convince a customer that none of their data is ever leaked to a
subsequent tenant using this device?

> Testing was performed on desktop platforms with M.2 WiFi and modem cards
> using M.2-to-PCIe adapters, including extensive force-reset cycling to
> verify stability.
> 
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> ---
> v9:
>   - Complete redesign based on maintainer feedback (Alex Williamson, Bjorn
>     Helgaas, Rafael Wysocki): dropped general d3cold infrastructure entirely
>     and now just a single patch: the proven D3hot reset for specific
>     Qualcomm devices (device-specific reset)
>   - Previous v8 patch 1/3 (general d3cold) dropped: concerns about ACPI
>     portability, bridge issues, runtime PM, and lack of _PR3 hardware for
>     testing.
>   - Previous v8 patch 3/3 (quirk_no_bus_reset) already merged for v7.2
> v8: https://lore.kernel.org/all/20260609163649.319755-1-jtornosm@redhat.com/
> 
>  drivers/pci/quirks.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 431c021d7414..bac1edb6c2dc 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4240,6 +4240,41 @@ static int reset_hinic_vf_dev(struct pci_dev *pdev, bool probe)
>  	return 0;
>  }
>  
> +/*
> + * Device-specific reset method for certain Qualcomm devices via D3hot power
> + * cycle.
> + *
> + * These specific Qualcomm devices lack FLR capability, advertise NoSoftRst+
> + * (blocking PM reset), and have broken bus reset. Despite advertising
> + * NoSoftRst+, testing shows that D3hot transition provides sufficient reset
> + * for VFIO reuse, particularly after unexpected VM termination where the
> + * device would otherwise remain in an undefined state. While not a complete
> + * reset (BARs are preserved), it provides the only viable reset mechanism for
> + * these devices in the situations described above.
> + */
> +static int reset_qualcomm_d3hot(struct pci_dev *dev, bool probe)
> +{
> +	int ret;
> +
> +	if (probe)
> +		return 0;
> +
> +	if (dev->current_state != PCI_D0)
> +		return -EINVAL;
> +
> +	ret = pci_set_power_state(dev, PCI_D3hot);
> +	if (ret)
> +		return ret;
> +	msleep(200);
> +
> +	ret = pci_set_power_state(dev, PCI_D0);
> +	if (ret)
> +		return ret;
> +	msleep(200);
> +
> +	return 0;

If we think this is a viable method, it seems like we should use
pci_pm_reset(), which takes care of IOMMU and device readiness issues.

We would have to change pci_pm_reset() to deal with the fact that
PCI_PM_CTRL_NO_SOFT_RESET seems wrong on these devices.  Maybe we
could cache PCI_PM_CTRL_NO_SOFT_RESET in pci_pm_init(), then override
it with quirks for these devices?
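
To make the caching idea concrete, here is a rough user-space model, not
kernel code.  Everything prefixed with model_ is hypothetical, including
the cached no_soft_reset field; only the PCI_PM_CTRL_NO_SOFT_RESET bit
value and the device IDs come from pci_regs.h and the patch above.  It
assumes the bit is read once at init time, a quirk clears the cached copy
for the listed Qualcomm IDs, and the probe path of the PM reset consults
the cached copy instead of re-reading PMCSR.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_PM_CTRL_NO_SOFT_RESET 0x0008 /* same bit value as pci_regs.h */
#define MODEL_VENDOR_QCOM         0x17cb /* vendor ID from the patch description */

/* Hypothetical stand-in for struct pci_dev: only what this model needs. */
struct model_dev {
	uint16_t vendor;
	uint16_t device;
	uint16_t pmcsr;       /* raw PMCSR value the device advertises */
	bool no_soft_reset;   /* hypothetical cached copy of the PMCSR bit */
};

/* Models what pci_pm_init() could do: cache the advertised bit once. */
static void model_pm_init(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->no_soft_reset = (dev->pmcsr & PCI_PM_CTRL_NO_SOFT_RESET) != 0;
}

/* Models a quirk: override the cached bit where it is believed wrong. */
static void model_quirk_qcom_soft_reset(struct model_dev *dev)
{
	if (dev->vendor != MODEL_VENDOR_QCOM)
		return;

	switch (dev->device) {
	case 0x1103: /* WCN6855 */
	case 0x1107: /* WCN7850 */
	case 0x0308: /* SDX62/SDX65 */
		dev->no_soft_reset = false;
		break;
	default:
		break;
	}
}

/* Models the probe path of a PM reset: usable only if state is not kept. */
static int model_pm_reset_probe(const struct model_dev *dev)
{
	return dev->no_soft_reset ? -ENOTTY : 0;
}
```

With that shape, a device-specific entry in pci_dev_reset_methods[] would
not be needed; the existing PM reset path would simply stop rejecting
these devices.  Whether overriding the bit is actually safe is the open
question about leaked state raised above.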

> +}
> +
>  static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
>  		 reset_intel_82599_sfp_virtfn },
> @@ -4255,6 +4290,9 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>  		reset_chelsio_generic_dev },
>  	{ PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HINIC_VF,
>  		reset_hinic_vf_dev },
> +	{ PCI_VENDOR_ID_QCOM, 0x1103, reset_qualcomm_d3hot },  /* WCN6855 */
> +	{ PCI_VENDOR_ID_QCOM, 0x1107, reset_qualcomm_d3hot },  /* WCN7850 */
> +	{ PCI_VENDOR_ID_QCOM, 0x0308, reset_qualcomm_d3hot },  /* SDX62/SDX65 */
>  	{ 0 }
>  };
>  
> -- 
> 2.54.0
> 

Thread overview: 3+ messages
2026-06-12 14:26 [PATCH v9] PCI: Add device-specific reset for Qualcomm devices Jose Ignacio Tornos Martinez
2026-06-12 15:12 ` Alex Williamson
2026-06-12 15:17 ` Bjorn Helgaas [this message]