The Linux Kernel Mailing List
* [PATCH v2] PCI: Disable broken FLR on MediaTek MT7925
@ 2026-05-08 14:51 Jose Ignacio Tornos Martinez
  2026-05-08 14:51 ` [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+ Jose Ignacio Tornos Martinez
  0 siblings, 1 reply; 5+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-05-08 14:51 UTC (permalink / raw)
  To: bhelgaas, alex; +Cc: linux-pci, linux-kernel, Jose Ignacio Tornos Martinez

The MediaTek MT7925 WiFi device (14c3:7925) advertises FLR capability
but the implementation is broken - reset always fails, leaving the device
in an undefined state.

This manifests in VFIO passthrough scenarios: Normal VM operation works
fine, including clean shutdown/reboot. However, when the VM terminates
uncleanly (crash, force-off), VFIO attempts to reset the device before
it can be assigned to another VM. Because FLR is broken, the reset fails
and the device remains in an undefined state, preventing reuse.

Disable FLR for this device so the PCI core falls back to working reset
methods (PM reset or bus reset).

This follows the existing pattern used for the MediaTek MT7922 WiFi
(14c3:0616), which is the predecessor device and already uses this quirk.

Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v2:
  - Change from device-specific D3cold reset to quirk_no_flr() approach
    based on maintainer feedback (Alex Williamson)
  - Follow existing pattern used by MediaTek MT7922 (0x0616)
v1: https://lore.kernel.org/all/20260507142916.392983-1-jtornosm@redhat.com/

 drivers/pci/quirks.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 000000000000..111111111111 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5607,6 +5607,7 @@
  * Intel 82579LM Gigabit Ethernet Controller 0x1502
  * Intel 82579V Gigabit Ethernet Controller 0x1503
  * Mediatek MT7922 802.11ax PCI Express Wireless Network Adapter
+ * Mediatek MT7925 802.11be PCI Express Wireless Network Adapter
  */
 static void quirk_no_flr(struct pci_dev *dev)
 {
@@ -5617,6 +5618,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x7901, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x0616, quirk_no_flr);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x7925, quirk_no_flr);

 /* FLR may cause the SolidRun SNET DPU (rev 0x1) to hang */
 static void quirk_no_flr_snet(struct pci_dev *dev)
--
2.53.0
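
The fallback described in the commit message can be pictured with a small userspace model. This is only a sketch, not kernel code: the struct, the enum and pick_reset_method() are hypothetical names, and the FLR, then PM reset, then bus reset ordering is a simplification of the PCI core's method list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are not kernel data structures. */
enum reset_method { RESET_NONE, RESET_FLR, RESET_PM, RESET_BUS };

struct reset_model_dev {
	bool has_flr_cap;	/* device advertises FLR */
	bool quirk_no_flr;	/* models the effect of quirk_no_flr() */
	bool no_soft_reset;	/* models NoSoftRst+ in the PM capability */
	bool bus_reset_ok;	/* bus reset usable for this function */
};

/* Return the first method the model considers usable, in a fixed order. */
enum reset_method pick_reset_method(const struct reset_model_dev *dev)
{
	assert(dev != NULL);

	if (dev->has_flr_cap && !dev->quirk_no_flr)
		return RESET_FLR;
	if (!dev->no_soft_reset)
		return RESET_PM;
	if (dev->bus_reset_ok)
		return RESET_BUS;
	return RESET_NONE;
}
```

In this model a device that advertises FLR always gets FLR until the quirk is set, after which selection moves on to PM reset or bus reset, which is the behavior the patch relies on.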



* [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+
  2026-05-08 14:51 [PATCH v2] PCI: Disable broken FLR on MediaTek MT7925 Jose Ignacio Tornos Martinez
@ 2026-05-08 14:51 ` Jose Ignacio Tornos Martinez
  2026-05-08 17:16   ` Alex Williamson
  0 siblings, 1 reply; 5+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-05-08 14:51 UTC (permalink / raw)
  To: bhelgaas, alex; +Cc: linux-pci, linux-kernel, Jose Ignacio Tornos Martinez

Some Qualcomm PCIe devices lack FLR capability and have the NoSoftRst+
flag set in their PM capability. This causes all standard PCI reset
methods to return -ENOTTY, leaving the device without any reset capability.

Add PCI_DEV_FLAGS_FORCE_PM_RESET flag to bypass the NoSoftRst check and
allow PM reset to proceed with the standard D3hot->D0 transition. This
provides these devices with a working reset method.

Apply this quirk to Qualcomm devices that need PM reset:
- ath11k WiFi (17cb:1103) - No FLR, NoSoftRst+, needs reset for reuse
- ath12k WiFi (17cb:1107) - No FLR, NoSoftRst+, needs reset for reuse
- SDX62/SDX65 5G modems (17cb:0308) - No FLR, NoSoftRst+, never initialize
  without proper reset (both modem generations share the same PCI device ID)

The problem manifests in VFIO passthrough scenarios:

1. WiFi devices (ath11k, ath12k): Normal VM operation works fine,
   including clean shutdown/reboot. However, when the VM terminates
   uncleanly (crash, force-off), VFIO attempts to reset the device.
   Without a working reset method, the device cannot be reused for
   another VM, preventing device reassignment.

2. Modem devices (SDX62/SDX65): Never successfully initialize even on
   first VM assignment without proper reset capability.

Testing showed that without this quirk, no reset is performed during
VFIO device initialization. With this quirk, PM reset succeeds and
devices work reliably in VFIO passthrough scenarios.

Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v2:
  - Split from original combined patch based on maintainer feedback (Alex
    Williamson) and commented results.
  - Change approach: instead of custom D3cold reset method, enable existing
    pci_pm_reset() by bypassing NoSoftRst check for affected devices
    (PCI_DEV_FLAGS_FORCE_PM_RESET flag is added for that)
v1: https://lore.kernel.org/all/20260507142916.392983-1-jtornosm@redhat.com/

 drivers/pci/pci.c    | 12 +++++++++---
 drivers/pci/quirks.c | 13 +++++++++++++
 include/linux/pci.h  |  2 ++
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8f7cfcc00090..e0b32eccfcf4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4451,6 +4451,10 @@ static int pci_af_flr(struct pci_dev *dev, bool probe)
  * cooldown period, which for the D0->D3hot and D3hot->D0 transitions is 10 ms
  * by default (i.e. unless the @dev's d3hot_delay field has a different value).
  * Moreover, only devices in D0 can be reset by this function.
+ *
+ * Some devices incorrectly advertise PCI_PM_CTRL_NO_SOFT_RESET but PM reset
+ * actually works. For such devices, PCI_DEV_FLAGS_FORCE_PM_RESET can be set
+ * via quirk to bypass the NO_SOFT_RESET check and enable PM reset.
  */
 static int pci_pm_reset(struct pci_dev *dev, bool probe)
 {
@@ -4460,9 +4464,11 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
 	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
 		return -ENOTTY;
 
-	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
-	if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
-		return -ENOTTY;
+	if (!(dev->dev_flags & PCI_DEV_FLAGS_FORCE_PM_RESET)) {
+		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
+		if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
+			return -ENOTTY;
+	}
 
 	if (probe)
 		return 0;
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index caaed1a01dc0..5e8b310c9d5f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5595,6 +5595,19 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
 
+/*
+ * Some devices incorrectly advertise NoSoftRst+ (suggesting PM reset won't
+ * work), but PM reset via D3hot->D0 transition actually works fine. Force
+ * PM reset for these devices to provide working reset capability.
+ */
+static void quirk_force_pm_reset(struct pci_dev *dev)
+{
+	dev->dev_flags |= PCI_DEV_FLAGS_FORCE_PM_RESET;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x1103, quirk_force_pm_reset); /* ath11k */
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x1107, quirk_force_pm_reset); /* ath12k */
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x0308, quirk_force_pm_reset); /* SDX62/SDX65 */
+
 /*
  * FLR may cause the following to devices to hang:
  *
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c4454583c11..714dbdaa21af 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -261,6 +261,8 @@ enum pci_dev_flags {
 	 * integrated with the downstream devices and doesn't use real PCI.
 	 */
 	PCI_DEV_FLAGS_PCI_BRIDGE_NO_ALIAS = (__force pci_dev_flags_t) (1 << 14),
+	/* Force PM reset even when NoSoftRst+ is set */
+	PCI_DEV_FLAGS_FORCE_PM_RESET = (__force pci_dev_flags_t) (1 << 15),
 };
 
 enum pci_irq_reroute_variant {
-- 
2.53.0
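
The probe-time gate that the pci.c hunk changes can be exercised outside the kernel with a small model. This is a sketch under assumptions: the struct and function names are hypothetical, the PMCSR value is passed in as a field instead of being read from config space, and the MODEL_* constants mirror the values this patch and the kernel headers are believed to use (bit 3 of PMCSR for No_Soft_Reset, bit 15 for the new flag).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror PCI_PM_CTRL_NO_SOFT_RESET and the dev_flags bits. */
#define MODEL_PM_CTRL_NO_SOFT_RESET	0x0008u
#define MODEL_FLAG_NO_PM_RESET		(1u << 7)
#define MODEL_FLAG_FORCE_PM_RESET	(1u << 15)

/* Hypothetical stand-in for the few struct pci_dev fields the check uses. */
struct pm_model_dev {
	uint8_t pm_cap;		/* offset of the PM capability, 0 if absent */
	uint32_t dev_flags;	/* quirk flags */
	uint16_t pmcsr;		/* value a PMCSR read would return */
};

/* Mirrors the patched probe path: 0 if PM reset is offered, else -ENOTTY. */
int model_pm_reset_probe(const struct pm_model_dev *dev)
{
	assert(dev != NULL);

	if (!dev->pm_cap || (dev->dev_flags & MODEL_FLAG_NO_PM_RESET))
		return -ENOTTY;

	/* The force flag skips only the NoSoftRst test, nothing else. */
	if (!(dev->dev_flags & MODEL_FLAG_FORCE_PM_RESET) &&
	    (dev->pmcsr & MODEL_PM_CTRL_NO_SOFT_RESET))
		return -ENOTTY;

	return 0;
}
```

Note that in this model, as in the hunk, a device with no PM capability or with the no-PM-reset flag is still rejected even when the force flag is set.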



* Re: [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+
  2026-05-08 14:51 ` [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+ Jose Ignacio Tornos Martinez
@ 2026-05-08 17:16   ` Alex Williamson
  2026-05-11 12:26     ` Jose Ignacio Tornos Martinez
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2026-05-08 17:16 UTC (permalink / raw)
  To: Jose Ignacio Tornos Martinez; +Cc: bhelgaas, linux-pci, linux-kernel, alex

On Fri,  8 May 2026 16:51:53 +0200
Jose Ignacio Tornos Martinez <jtornosm@redhat.com> wrote:

> Some Qualcomm PCIe devices lack FLR capability and have the NoSoftRst+
> flag set in their PM capability. This causes all standard PCI reset
> methods to return -ENOTTY, leaving the device without any reset capability.
> 
> Add PCI_DEV_FLAGS_FORCE_PM_RESET flag to bypass the NoSoftRst check and
> allow PM reset to proceed with the standard D3hot->D0 transition. This
> provides these devices with a working reset method.
> 
> Apply this quirk to Qualcomm devices that need PM reset:
> - ath11k WiFi (17cb:1103) - No FLR, NoSoftRst+, needs reset for reuse
> - ath12k WiFi (17cb:1107) - No FLR, NoSoftRst+, needs reset for reuse
> - SDX62/SDX65 5G modems (17cb:0308) - No FLR, NoSoftRst+, never initialize
>   without proper reset (both modem generations share the same PCI device ID)
> 
> The problem manifests in VFIO passthrough scenarios:
> 
> 1. WiFi devices (ath11k, ath12k): Normal VM operation works fine,
>    including clean shutdown/reboot. However, when the VM terminates
>    uncleanly (crash, force-off), VFIO attempts to reset the device.
>    Without a working reset method, the device cannot be reused for
>    another VM, preventing device reassignment.
> 
> 2. Modem devices (SDX62/SDX65): Never successfully initialize even on
>    first VM assignment without proper reset capability.

What does reset_methods sysfs attribute report for these devices on an
unpatched kernel?

I'd tend to expect these are single-function devices where bus reset
would be available as a function level reset.  Even in the case of a
multi-function device, vfio-pci could be performing a bus reset if all
the functions are bound to vfio-pci.  I'm very suspicious that this is
just masking an underlying issue relative to bus reset for these
devices, especially if we haven't actually verified the device state is
actually reset on transition back to D0 and we're just relying on
heuristics that this makes it work.  Thanks,

Alex

> Testing showed that without this quirk, no reset is performed during
> VFIO device initialization. With this quirk, PM reset succeeds and
> devices work reliably in VFIO passthrough scenarios.
> 
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> ---
> v2:
>   - Split from original combined patch based on maintainer feedback (Alex
>     Williamson) and commented results.
>   - Change approach: instead of custom D3cold reset method, enable existing
>     pci_pm_reset() by bypassing NoSoftRst check for affected devices
>     (PCI_DEV_FLAGS_FORCE_PM_RESET flag is added for that)
> v1: https://lore.kernel.org/all/20260507142916.392983-1-jtornosm@redhat.com/
> 
>  drivers/pci/pci.c    | 12 +++++++++---
>  drivers/pci/quirks.c | 13 +++++++++++++
>  include/linux/pci.h  |  2 ++
>  3 files changed, 24 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8f7cfcc00090..e0b32eccfcf4 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4451,6 +4451,10 @@ static int pci_af_flr(struct pci_dev *dev, bool probe)
>   * cooldown period, which for the D0->D3hot and D3hot->D0 transitions is 10 ms
>   * by default (i.e. unless the @dev's d3hot_delay field has a different value).
>   * Moreover, only devices in D0 can be reset by this function.
> + *
> + * Some devices incorrectly advertise PCI_PM_CTRL_NO_SOFT_RESET but PM reset
> + * actually works. For such devices, PCI_DEV_FLAGS_FORCE_PM_RESET can be set
> + * via quirk to bypass the NO_SOFT_RESET check and enable PM reset.
>   */
>  static int pci_pm_reset(struct pci_dev *dev, bool probe)
>  {
> @@ -4460,9 +4464,11 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
>  	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
>  		return -ENOTTY;
>  
> -	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
> -	if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
> -		return -ENOTTY;
> +	if (!(dev->dev_flags & PCI_DEV_FLAGS_FORCE_PM_RESET)) {
> +		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
> +		if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
> +			return -ENOTTY;
> +	}
>  
>  	if (probe)
>  		return 0;
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index caaed1a01dc0..5e8b310c9d5f 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5595,6 +5595,19 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
>  }
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
>  
> +/*
> + * Some devices incorrectly advertise NoSoftRst+ (suggesting PM reset won't
> + * work), but PM reset via D3hot->D0 transition actually works fine. Force
> + * PM reset for these devices to provide working reset capability.
> + */
> +static void quirk_force_pm_reset(struct pci_dev *dev)
> +{
> +	dev->dev_flags |= PCI_DEV_FLAGS_FORCE_PM_RESET;
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x1103, quirk_force_pm_reset); /* ath11k */
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x1107, quirk_force_pm_reset); /* ath12k */
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x0308, quirk_force_pm_reset); /* SDX62/SDX65 */
> +
>  /*
>   * FLR may cause the following to devices to hang:
>   *
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2c4454583c11..714dbdaa21af 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -261,6 +261,8 @@ enum pci_dev_flags {
>  	 * integrated with the downstream devices and doesn't use real PCI.
>  	 */
>  	PCI_DEV_FLAGS_PCI_BRIDGE_NO_ALIAS = (__force pci_dev_flags_t) (1 << 14),
> +	/* Force PM reset even when NoSoftRst+ is set */
> +	PCI_DEV_FLAGS_FORCE_PM_RESET = (__force pci_dev_flags_t) (1 << 15),
>  };
>  
>  enum pci_irq_reroute_variant {



* Re: [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+
  2026-05-08 17:16   ` Alex Williamson
@ 2026-05-11 12:26     ` Jose Ignacio Tornos Martinez
  2026-05-11 19:36       ` Alex Williamson
  0 siblings, 1 reply; 5+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-05-11 12:26 UTC (permalink / raw)
  To: alex; +Cc: bhelgaas, jtornosm, linux-kernel, linux-pci

Hello Alex,

Thank you again for your review.

> What does reset_methods sysfs attribute report for these devices on an
> unpatched kernel?
The kernel we use doesn't have CONFIG_PCI_RESET_SYSFS enabled,
so reset_methods is not available. However, I can provide the actual
behavior observed through testing and dmesg logs.
 
> I'd tend to expect these are single-function devices where bus reset
> would be available as a function level reset.
Yes, these are single-function devices (PCI header type 00).
For example, here's the ath11k device:
  lspci -xxx -s 0000:03:00.0 | head -2
    03:00.0 Network controller: Qualcomm Technologies, Inc QCNFA765
    00: cb 17 03 11 06 05 10 00 01 00 80 02 10 00 00 00
                                                  ^^
                                     Header type: 00 (single-function)
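
For readers decoding the dump above by hand, the following sketch pulls the same fields out of the first 16 config space bytes. The offsets (vendor ID at 0x00, device ID at 0x02, header type at 0x0e with bit 7 as the multi-function bit) are the standard PCI header layout; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_VENDOR_ID	0x00
#define CFG_DEVICE_ID	0x02
#define CFG_HEADER_TYPE	0x0e

/* Config space is little-endian: low byte first. */
uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
	assert(cfg != NULL);
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Bit 7 of the header type byte marks a multi-function device. */
bool cfg_is_multifunction(const uint8_t *cfg)
{
	assert(cfg != NULL);
	return (cfg[CFG_HEADER_TYPE] & 0x80) != 0;
}

/* Bits 6:0 select the header layout (0 = endpoint, 1 = bridge). */
uint8_t cfg_header_layout(const uint8_t *cfg)
{
	assert(cfg != NULL);
	return cfg[CFG_HEADER_TYPE] & 0x7f;
}
```

Fed the bytes shown above, this yields vendor 0x17cb, device 0x1103, layout 0 and no multi-function bit, matching the "single-function" reading.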
 
> I'm very suspicious that this is just masking an underlying issue
> relative to bus reset for these devices
Yes, you are right, there is an underlying bus reset issue. Let me explain
what I have observed through the testing:
Testing showed no reset is performed at all. During both VM startup and
virsh reset operations, there are no reset-related messages in dmesg.
The reset hierarchy returns -ENOTTY at each step:
  - No FLR (device doesn't advertise it)
  - PM reset returns -ENOTTY (NoSoftRst+ flag)
  - Bus reset apparently not attempted
When testing the suggested quirk_no_flr() approach (which worked for
mt7925e), dmesg shows secondary bus reset is attempted:
  vfio-pci 0000:06:00.0: enabling device (0000 -> 0002)
  vfio-pci 0000:06:00.0: resetting
  pcieport 0000:00:1c.4: unlocked secondary bus reset via: __pci_reset_function_locked
  vfio-pci 0000:06:00.0: reset done
However, the device becomes unresponsive after this:
  lspci -vvvvvvvvvvvv -s 0000:03:00.0
    03:00.0 Network controller: Qualcomm Technologies, Inc (rev ff) (prog-if ff)
        !!! Unknown header type 7f
And all config space reads return 0xFF, indicating the device is not
responding after bus reset.
With PM reset (D3hot->D0), the reset succeeds and the device works correctly
through multiple VM lifecycles (startup, virsh reset, shutdown/restart).
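
The "Unknown header type 7f" line and the all-0xFF reads reported above are two views of the same symptom: the header type byte reads back as 0xff, and masking off the multi-function bit leaves 0x7f (presumably what lspci does before printing). A minimal sketch of that check, with hypothetical function names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CFG_HEADER_TYPE	0x0e

/* True when every byte reads 0xff, the usual sign that nothing answered. */
bool cfg_reads_all_ones(const uint8_t *cfg, size_t len)
{
	assert(cfg != NULL);

	for (size_t i = 0; i < len; i++)
		if (cfg[i] != 0xff)
			return false;
	return len > 0;
}

/* Header layout with the multi-function bit (0x80) masked off. */
uint8_t cfg_header_layout(const uint8_t *cfg)
{
	assert(cfg != NULL);
	return cfg[CFG_HEADER_TYPE] & 0x7f;
}
```

A header dump that passes cfg_reads_all_ones() carries no information about the device itself; it only says the reads were not completed by the endpoint.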
                                                                                                                                                                                                                     
> especially if we haven't actually verified the device state is
> actually reset on transition back to D0
The verification is functional: with our patch, the device successfully
initializes in the guest after VM reset operations, and continues working
through multiple reset cycles. Without a working reset (default kernel),
WiFi devices (ath11k, ath12k) cannot be reused after VM termination, and
modem devices (SDX62/SDX65) fail to initialize even on first VM assignment.
                                                                                                                                                                                                                     
Summary:
You're correct that there's a bus reset issue: SBR breaks these devices.
The question is whether we should:
  1. Investigate why SBR breaks these single-function devices
  2. Use PM reset which demonstrably works
Option 1 may involve firmware-level investigation, while the PM reset
approach provides a working solution.
This situation is similar to existing quirks: quirk_no_flr() works around
devices with broken FLR implementations. Here we're working around devices
that incorrectly advertise NoSoftRst+ (preventing PM reset) while SBR doesn't
work properly.
I'm open to your guidance on the best path forward.

Thanks

Best regards
José Ignacio



* Re: [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+
  2026-05-11 12:26     ` Jose Ignacio Tornos Martinez
@ 2026-05-11 19:36       ` Alex Williamson
  0 siblings, 0 replies; 5+ messages in thread
From: Alex Williamson @ 2026-05-11 19:36 UTC (permalink / raw)
  To: Jose Ignacio Tornos Martinez; +Cc: bhelgaas, linux-kernel, linux-pci, alex

On Mon, 11 May 2026 14:26:21 +0200
Jose Ignacio Tornos Martinez <jtornosm@redhat.com> wrote:
                                                       
>
> > What does reset_methods sysfs attribute report for these devices on an
> > unpatched kernel?  
> The kernel we use doesn't have CONFIG_PCI_RESET_SYSFS enabled,
> so reset_methods is not available. However, I can provide the actual
> behavior observed through testing and dmesg logs.

What kernel is this?  I don't find any reference to such a Kconfig
option.

> > I'd tend to expect these are single-function devices where bus reset
> > would be available as a function level reset.  
> Yes, these are single-function devices (PCI header type 00).
> For example, here's the ath11k device:
>   lspci -xxx -s 0000:03:00.0 | head -2
>     03:00.0 Network controller: Qualcomm Technologies, Inc QCNFA765
>     00: cb 17 03 11 06 05 10 00 01 00 80 02 10 00 00 00
>                                                   ^^
>                                      Header type: 00 (single-function)
>  
> > I'm very suspicious that this is just masking an underlying issue
> > relative to bus reset for these devices  
> Yes, you are right, there is an underlying bus reset issue. Let me explain
> what I have observed through the testing:
> Testing showed no reset is performed at all. During both VM startup and
> virsh reset operations, there are no reset-related messages in dmesg.
> The reset hierarchy returns -ENOTTY at each step:
>   - No FLR (device doesn't advertise it)
>   - PM reset returns -ENOTTY (NoSoftRst+ flag)
>   - Bus reset apparently not attempted

Bus reset should be used for function level reset of a single function
device unless either the downstream port or the endpoint are quirked to
prevent it.  I don't see any such quirk for 17cb:1103.  What's the ID
of the root port?

> When testing the suggested quirk_no_flr() approach (which worked for
> mt7925e), dmesg shows secondary bus reset is attempted:
>   vfio-pci 0000:06:00.0: enabling device (0000 -> 0002)
>   vfio-pci 0000:06:00.0: resetting
>   pcieport 0000:00:1c.4: unlocked secondary bus reset via: __pci_reset_function_locked
>   vfio-pci 0000:06:00.0: reset done
> However, the device becomes unresponsive after this:
>   lspci -vvvvvvvvvvvv -s 0000:03:00.0
>     03:00.0 Network controller: Qualcomm Technologies, Inc (rev ff) (prog-if ff)
>         !!! Unknown header type 7f
> And all config space reads return 0xFF, indicating the device is not
> responding after bus reset.
> With PM reset (D3hot->D0), the reset succeeds and the device works correctly
> through multiple VM lifecycles (startup, virsh reset, shutdown/restart).
>
> > especially if we haven't actually verified the device state is
> > actually reset on transition back to D0                 
> The verification is functional: with our patch, the device successfully
> initializes in the guest after VM reset operations, and continues working
> through multiple reset cycles. Without a working reset (default kernel),
> WiFi devices (ath11k, ath12k) cannot be reused after VM termination, and
> modem devices (SDX62/SDX65) fail to initialize even on first VM assignment.
>
> Summary:
> You're correct that there's a bus reset issue, SBR breaks these devices.
> The question is whether we should:
>   1. Investigate why SBR breaks these single-function devices

Then why aren't we setting quirks to use quirk_no_bus_reset() for these
devices?

>   2. Use PM reset which demonstrably works
> Option 1 may involve firmware-level investigation, while the PM reset
> approach provides a working solution.
> This situation is similar to existing quirks: quirk_no_flr() works around
> devices with broken FLR implementations. Here we're working around devices
> that incorrectly advertise NoSoftRst+ (preventing PM reset) while SBR doesn't
> work properly.
> I'm open to your guidance on the best path forward.

Proving that an advertised reset method doesn't work is much easier
than proving an unadvertised reset method does work.  What's being
proposed here effectively ignores 1) while asserting that 2) then
works.  Does 2) work only because it prevents the fall through to 1),
which is known broken, or does it have merit on its own.  I can't tell.

Whether supported in your kernel or not, the mainline kernel does also
have support for modifying reset method priorities through sysfs, so
the fall through order assumed here isn't necessarily what everyone
will experience.

I would start with disabling the reset methods that are known broken,
FLR and bus reset.  Test whether that results in reliable behavior.

If that's still not as reliable as you're seeing by adding the
transition through D3hot, then I'd be open to the discussion of whether
these devices do in fact need a device specific reset or quirk to PM
reset (and everywhere else that tests PCI_PM_CTRL_NO_SOFT_RESET).

The previous patch[1] proposed a device specific reset passing the
device through D3cold.  This muddies the waters a bit because D3cold
will actually power off the device causing a reset, but the ability to
enter D3cold depends on the platform, not the device.  We can't tell
from the code what state the device actually entered there.

OTOH, the quirk proposed here would only achieve D3hot.  Are the BAR
values preserved or cleared immediately after transition to D0?  If
cleared, that could provide supporting evidence that NoSoftRst is
actually misrepresented by the device.  If not, we're really just
looking at a heuristic that an internal reset might be occurring, but
only the vendor could confirm.  Thanks,

Alex


PS - D3cold might be an interesting reset method that could be
implemented for single function endpoints in slots that support it.

[1]https://lore.kernel.org/all/20260507142916.392983-1-jtornosm@redhat.com/
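
The BAR question raised above can be turned into a mechanical comparison once two snapshots of the six BAR dwords exist, one taken in D0 before entering D3hot and one taken right after returning to D0. The sketch below is hypothetical helper code, not an existing kernel or pciutils interface; it assumes the second snapshot is read before any software restores config space, and it compares only the address bits so that the read-only type bits do not skew the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BARS		6
#define BAR_SPACE_IO		0x01u	/* bit 0: I/O BAR */
#define BAR_MEM_TYPE_MASK	0x06u	/* bits 2:1: memory BAR type */
#define BAR_MEM_TYPE_64		0x04u	/* 64-bit BAR, next dword is upper half */

enum bar_verdict {
	BARS_NOTHING_PROGRAMMED,	/* no address was set, test inconclusive */
	BARS_PRESERVED,			/* every programmed address survived */
	BARS_CLEARED,			/* every programmed address reads zero */
	BARS_MIXED			/* anything else */
};

enum bar_verdict classify_bars(const uint32_t before[NUM_BARS],
			       const uint32_t after[NUM_BARS])
{
	int programmed = 0, kept = 0, zeroed = 0;
	bool upper_half = false;

	assert(before != NULL && after != NULL);

	for (size_t i = 0; i < NUM_BARS; i++) {
		uint32_t mask;

		if (upper_half) {
			mask = 0xffffffffu;	/* upper dword is all address */
			upper_half = false;
		} else if (before[i] & BAR_SPACE_IO) {
			mask = ~0x3u;
		} else {
			mask = ~0xfu;
			upper_half = (before[i] & BAR_MEM_TYPE_MASK) ==
				     BAR_MEM_TYPE_64;
		}

		uint32_t b = before[i] & mask;
		uint32_t a = after[i] & mask;

		if (!b)
			continue;	/* slot was never programmed */
		programmed++;
		if (a == b)
			kept++;
		else if (a == 0)
			zeroed++;
	}

	if (!programmed)
		return BARS_NOTHING_PROGRAMMED;
	if (kept == programmed)
		return BARS_PRESERVED;
	if (zeroed == programmed)
		return BARS_CLEARED;
	return BARS_MIXED;
}
```

A BARS_CLEARED result would be the supporting evidence described above that internal state is lost across D3hot; BARS_PRESERVED would leave the question open.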

