From: Alex Williamson <alex@shazbot.org>
To: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, alex@shazbot.org
Subject: Re: [PATCH] PCI: Add D3cold reset quirk for devices with broken/missing FLR
Date: Thu, 7 May 2026 09:25:18 -0600
Message-ID: <20260507092518.186c7f1b@shazbot.org>
In-Reply-To: <20260507142916.392983-1-jtornosm@redhat.com>

On Thu,  7 May 2026 16:29:16 +0200
Jose Ignacio Tornos Martinez <jtornosm@redhat.com> wrote:

> Some PCIe devices require D3cold power state transitions for proper
> firmware reset when used in VFIO passthrough scenarios, but lack
> Function Level Reset (FLR) capability or have incomplete FLR
> implementations that don't fully reset firmware state.
> 
> Known devices affected:
> - Qualcomm ath11k WiFi (17cb:1103) - FLReset-, reset unreliable
> - Qualcomm ath12k WiFi (17cb:1107) - FLReset-, reset always fails
> - Qualcomm SDX62/SDX65 5G modems (17cb:0308) - FLReset-, never
>   initialize in VMs
> - MediaTek mt7925e WiFi (14c3:7925) - FLReset+ but broken, reset always
>   fails
> 
> The problem manifests in two scenarios:
> 
> 1. WiFi devices (ath11k, ath12k, mt7925e): Normal VM operation works fine,
>    including clean shutdown/reboot. However, when the VM terminates
>    uncleanly (crash, force-off), VFIO attempts to reset the device before
>    it can be assigned to another VM. Because FLR is missing or broken, the
>    reset fails and the device remains in an undefined state, preventing
>    reuse.
> 
> 2. Modem devices (SDX62/SDX65): Never successfully initialize even on first
>    VM assignment due to lack of proper reset capability.
> 
> Add reset_device_d3cold() quirk that performs a simple D3cold->D0
> power cycle when the device is bound to vfio-pci. This provides
> firmware reset capability for VM reset operations where standard PCI
> reset methods are insufficient.
> 
> The quirk only applies during VFIO passthrough - native drivers use
> their own reset mechanisms and will fall back to standard PCI reset
> methods by returning -ENOTTY.
> 
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> ---
>  drivers/pci/quirks.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index caaed1a01dc0..11d9a8b562e4 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4237,6 +4237,40 @@ static int reset_hinic_vf_dev(struct pci_dev *pdev, bool probe)
>  	return 0;
>  }
>  
> +/*
> + * Some devices need D3cold->D0 power cycle for proper firmware reset
> + * when used in VFIO passthrough. Some claim FLReset+ but it's incomplete,
> + * others lack FLR entirely, and standard reset methods don't fully reset
> + * firmware state. On bare metal with native drivers, we skip this and let
> + * the driver handle reset via standard methods.
> + */
> +static int reset_device_d3cold(struct pci_dev *dev, bool probe)
> +{
> +	int ret;
> +
> +	if (probe)
> +		return 0;
> +
> +	if (!dev->driver || strcmp(dev->driver->name, "vfio-pci") != 0)

We should not have driver-dependent reset behavior.  If FLR is broken,
add these devices to the list of devices using quirk_no_flr() and we'll
fall back to another reset method.  We also shouldn't be implementing a
variant of pci_pm_reset().  PM reset can also be prioritized over FLR
via the reset_method sysfs attribute if the reset method really is
tied to the usage.  Thanks,

Alex
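
[Editorial sketch, not part of the original message.]  The sysfs
route mentioned above could look roughly like the user-space helper
below.  It assumes the per-device attribute lives at
/sys/bus/pci/devices/<BDF>/reset_method and accepts a space-separated,
priority-ordered list of method names such as "pm flr".  The helper
names reset_method_path() and write_reset_methods() and the address
0000:01:00.0 are hypothetical, chosen only for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Build the (assumed) sysfs path of the reset_method attribute for the
 * PCI device at address "bdf", e.g. "0000:01:00.0".
 * Returns 0 on success, -1 if the buffer is too small.
 */
int reset_method_path(const char *bdf, char *buf, size_t len)
{
	int n = snprintf(buf, len,
			 "/sys/bus/pci/devices/%s/reset_method", bdf);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write a space-separated list of reset methods, highest priority
 * first (for example "pm flr"), to the attribute at "path".
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_reset_methods(const char *path, const char *methods)
{
	FILE *f = fopen(path, "w");
	int rc;

	if (!f)
		return -1;

	rc = (fputs(methods, f) < 0) ? -1 : 0;
	if (fclose(f) != 0)	/* sysfs write errors can surface on close */
		rc = -1;

	return rc;
}
```

Run as root, a caller would build the path for the device with
reset_method_path() and then call write_reset_methods(path, "pm flr")
to ask for PM reset ahead of FLR.  Reading the attribute back should
show which methods the kernel accepted for that device.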

> +		return -ENOTTY;
> +
> +	/*
> +	 * D3cold->D0 power cycle for firmware reset.
> +	 * VFIO has already disabled interrupts and will handle state
> +	 * save/restore, so we just do the power transition.
> +	 */
> +	ret = pci_set_power_state(dev, PCI_D3cold);
> +	if (ret && ret != -EIO)
> +		pci_warn(dev, "D3cold transition failed: %d\n", ret);
> +
> +	ret = pci_set_power_state(dev, PCI_D0);
> +	if (ret && ret != -EIO)
> +		pci_warn(dev, "D0 transition failed: %d\n", ret);
> +
> +	pci_info(dev, "D3cold reset completed\n");
> +	return 0;
> +}
> +
>  static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>  	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
>  		 reset_intel_82599_sfp_virtfn },
> @@ -4252,6 +4286,10 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
>  		reset_chelsio_generic_dev },
>  	{ PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HINIC_VF,
>  		reset_hinic_vf_dev },
> +	{ PCI_VENDOR_ID_QCOM, 0x1103, reset_device_d3cold },     /* Qualcomm ath11k WiFi */
> +	{ PCI_VENDOR_ID_QCOM, 0x1107, reset_device_d3cold },     /* Qualcomm ath12k WiFi */
> +	{ PCI_VENDOR_ID_QCOM, 0x0308, reset_device_d3cold },     /* Qualcomm SDX62/SDX65 5G modem */
> +	{ PCI_VENDOR_ID_MEDIATEK, 0x7925, reset_device_d3cold }, /* MediaTek mt7925e WiFi */
>  	{ 0 }
>  };
>  



Thread overview: 4+ messages
2026-05-07 14:29 [PATCH] PCI: Add D3cold reset quirk for devices with broken/missing FLR Jose Ignacio Tornos Martinez
2026-05-07 15:25 ` Alex Williamson [this message]
2026-05-08 11:05   ` Jose Ignacio Tornos Martinez
2026-05-07 22:42 ` sashiko-bot
