* [PATCH v2] PCI: Disable broken FLR on MediaTek MT7925
@ 2026-05-08 14:51 Jose Ignacio Tornos Martinez
2026-05-08 14:51 ` [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+ Jose Ignacio Tornos Martinez
2026-05-08 21:22 ` [PATCH v2] PCI: Disable broken FLR on MediaTek MT7925 sashiko-bot
0 siblings, 2 replies; 5+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-05-08 14:51 UTC (permalink / raw)
To: bhelgaas, alex; +Cc: linux-pci, linux-kernel, Jose Ignacio Tornos Martinez

The MediaTek MT7925 WiFi device (14c3:7925) advertises FLR capability
but the implementation is broken - reset always fails, leaving the device
in an undefined state.

This manifests in VFIO passthrough scenarios: Normal VM operation works
fine, including clean shutdown/reboot. However, when the VM terminates
uncleanly (crash, force-off), VFIO attempts to reset the device before
it can be assigned to another VM. Because FLR is broken, the reset fails
and the device remains in an undefined state, preventing reuse.

Disable FLR for this device so the PCI core falls back to working reset
methods (PM reset or bus reset).

This follows the existing pattern used for the MediaTek MT7922 WiFi
(14c3:0616), which is the predecessor device and already uses this quirk.

Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v2:
- Change from device-specific D3cold reset to quirk_no_flr() approach
  based on maintainer feedback (Alex Williamson)
- Follow existing pattern used by MediaTek MT7922 (0x0616)
v1: https://lore.kernel.org/all/20260507142916.392983-1-jtornosm@redhat.com/

 drivers/pci/quirks.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 000000000000..111111111111 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5607,6 +5607,7 @@
  * Intel 82579LM Gigabit Ethernet Controller 0x1502
  * Intel 82579V Gigabit Ethernet Controller 0x1503
  * Mediatek MT7922 802.11ax PCI Express Wireless Network Adapter
+ * Mediatek MT7925 802.11be PCI Express Wireless Network Adapter
  */
 static void quirk_no_flr(struct pci_dev *dev)
 {
@@ -5617,6 +5618,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x7901, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x0616, quirk_no_flr);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x7925, quirk_no_flr);
 
 /* FLR may cause the SolidRun SNET DPU (rev 0x1) to hang */
 static void quirk_no_flr_snet(struct pci_dev *dev)
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread

* [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+
  2026-05-08 14:51 [PATCH v2] PCI: Disable broken FLR on MediaTek MT7925 Jose Ignacio Tornos Martinez
@ 2026-05-08 14:51 ` Jose Ignacio Tornos Martinez
  2026-05-08 17:16   ` Alex Williamson
  2026-05-08 21:21   ` sashiko-bot
  2026-05-08 21:22 ` [PATCH v2] PCI: Disable broken FLR on MediaTek MT7925 sashiko-bot
  1 sibling, 2 replies; 5+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-05-08 14:51 UTC (permalink / raw)
To: bhelgaas, alex; +Cc: linux-pci, linux-kernel, Jose Ignacio Tornos Martinez

Some Qualcomm PCIe devices lack FLR capability and have the NoSoftRst+
flag set in their PM capability. This causes all standard PCI reset
methods to return -ENOTTY, leaving the device without any reset capability.

Add PCI_DEV_FLAGS_FORCE_PM_RESET flag to bypass the NoSoftRst check and
allow PM reset to proceed with the standard D3hot->D0 transition. This
provides these devices with a working reset method.

Apply this quirk to Qualcomm devices that need PM reset:
- ath11k WiFi (17cb:1103) - No FLR, NoSoftRst+, needs reset for reuse
- ath12k WiFi (17cb:1107) - No FLR, NoSoftRst+, needs reset for reuse
- SDX62/SDX65 5G modems (17cb:0308) - No FLR, NoSoftRst+, never initialize
  without proper reset (both modem generations share the same PCI device ID)

The problem manifests in VFIO passthrough scenarios:

1. WiFi devices (ath11k, ath12k): Normal VM operation works fine,
   including clean shutdown/reboot. However, when the VM terminates
   uncleanly (crash, force-off), VFIO attempts to reset the device.
   Without a working reset method, the device cannot be reused for
   another VM, preventing device reassignment.

2. Modem devices (SDX62/SDX65): Never successfully initialize even on
   first VM assignment without proper reset capability.

Testing showed that without this quirk, no reset is performed during
VFIO device initialization. With this quirk, PM reset succeeds and
devices work reliably in VFIO passthrough scenarios.

Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v2:
- Split from original combined patch based on maintainer feedback (Alex
  Williamson) and commented results.
- Change approach: instead of custom D3cold reset method, enable existing
  pci_pm_reset() by bypassing NoSoftRst check for affected devices
  (PCI_DEV_FLAGS_FORCE_PM_RESET flag is added for that)
v1: https://lore.kernel.org/all/20260507142916.392983-1-jtornosm@redhat.com/

 drivers/pci/pci.c    | 12 +++++++++---
 drivers/pci/quirks.c | 13 +++++++++++++
 include/linux/pci.h  |  2 ++
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8f7cfcc00090..e0b32eccfcf4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4451,6 +4451,10 @@ static int pci_af_flr(struct pci_dev *dev, bool probe)
  * cooldown period, which for the D0->D3hot and D3hot->D0 transitions is 10 ms
  * by default (i.e. unless the @dev's d3hot_delay field has a different value).
  * Moreover, only devices in D0 can be reset by this function.
+ *
+ * Some devices incorrectly advertise PCI_PM_CTRL_NO_SOFT_RESET but PM reset
+ * actually works. For such devices, PCI_DEV_FLAGS_FORCE_PM_RESET can be set
+ * via quirk to bypass the NO_SOFT_RESET check and enable PM reset.
  */
 static int pci_pm_reset(struct pci_dev *dev, bool probe)
 {
@@ -4460,9 +4464,11 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
 	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
 		return -ENOTTY;
 
-	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
-	if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
-		return -ENOTTY;
+	if (!(dev->dev_flags & PCI_DEV_FLAGS_FORCE_PM_RESET)) {
+		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
+		if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
+			return -ENOTTY;
+	}
 
 	if (probe)
 		return 0;
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index caaed1a01dc0..5e8b310c9d5f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5595,6 +5595,19 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
 }
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
 
+/*
+ * Some devices incorrectly advertise NoSoftRst+ (suggesting PM reset won't
+ * work), but PM reset via D3hot->D0 transition actually works fine. Force
+ * PM reset for these devices to provide working reset capability.
+ */
+static void quirk_force_pm_reset(struct pci_dev *dev)
+{
+	dev->dev_flags |= PCI_DEV_FLAGS_FORCE_PM_RESET;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x1103, quirk_force_pm_reset); /* ath11k */
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x1107, quirk_force_pm_reset); /* ath12k */
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x0308, quirk_force_pm_reset); /* SDX62/SDX65 */
+
 /*
  * FLR may cause the following to devices to hang:
  *
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c4454583c11..714dbdaa21af 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -261,6 +261,8 @@ enum pci_dev_flags {
 	 * integrated with the downstream devices and doesn't use real PCI.
 	 */
 	PCI_DEV_FLAGS_PCI_BRIDGE_NO_ALIAS = (__force pci_dev_flags_t) (1 << 14),
+	/* Force PM reset even when NoSoftRst+ is set */
+	PCI_DEV_FLAGS_FORCE_PM_RESET = (__force pci_dev_flags_t) (1 << 15),
 };
 
 enum pci_irq_reroute_variant {
--
2.53.0
^ permalink raw reply related [flat|nested] 5+ messages in thread

* Re: [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+
  2026-05-08 14:51 ` [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+ Jose Ignacio Tornos Martinez
@ 2026-05-08 17:16   ` Alex Williamson
  2026-05-08 21:21   ` sashiko-bot
  1 sibling, 0 replies; 5+ messages in thread
From: Alex Williamson @ 2026-05-08 17:16 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez; +Cc: bhelgaas, linux-pci, linux-kernel, alex

On Fri, 8 May 2026 16:51:53 +0200
Jose Ignacio Tornos Martinez <jtornosm@redhat.com> wrote:

> Some Qualcomm PCIe devices lack FLR capability and have the NoSoftRst+
> flag set in their PM capability. This causes all standard PCI reset
> methods to return -ENOTTY, leaving the device without any reset capability.
>
> Add PCI_DEV_FLAGS_FORCE_PM_RESET flag to bypass the NoSoftRst check and
> allow PM reset to proceed with the standard D3hot->D0 transition. This
> provides these devices with a working reset method.
>
> Apply this quirk to Qualcomm devices that need PM reset:
> - ath11k WiFi (17cb:1103) - No FLR, NoSoftRst+, needs reset for reuse
> - ath12k WiFi (17cb:1107) - No FLR, NoSoftRst+, needs reset for reuse
> - SDX62/SDX65 5G modems (17cb:0308) - No FLR, NoSoftRst+, never initialize
>   without proper reset (both modem generations share the same PCI device ID)
>
> The problem manifests in VFIO passthrough scenarios:
>
> 1. WiFi devices (ath11k, ath12k): Normal VM operation works fine,
>    including clean shutdown/reboot. However, when the VM terminates
>    uncleanly (crash, force-off), VFIO attempts to reset the device.
>    Without a working reset method, the device cannot be reused for
>    another VM, preventing device reassignment.
>
> 2. Modem devices (SDX62/SDX65): Never successfully initialize even on
>    first VM assignment without proper reset capability.

What does reset_methods sysfs attribute report for these devices on an
unpatched kernel? I'd tend to expect these are single-function devices
where bus reset would be available as a function level reset. Even in
the case of a multi-function device, vfio-pci could be performing a bus
reset if all the functions are bound to vfio-pci.

I'm very suspicious that this is just masking an underlying issue
relative to bus reset for these devices, especially if we haven't
actually verified the device state is actually reset on transition back
to D0 and we're just relying on heuristics that this makes it work.

Thanks,
Alex

> Testing showed that without this quirk, no reset is performed during
> VFIO device initialization. With this quirk, PM reset succeeds and
> devices work reliably in VFIO passthrough scenarios.
>
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> ---
> v2:
> - Split from original combined patch based on maintainer feedback (Alex
>   Williamson) and commented results.
> - Change approach: instead of custom D3cold reset method, enable existing
>   pci_pm_reset() by bypassing NoSoftRst check for affected devices
>   (PCI_DEV_FLAGS_FORCE_PM_RESET flag is added for that)
> v1: https://lore.kernel.org/all/20260507142916.392983-1-jtornosm@redhat.com/
>
>  drivers/pci/pci.c    | 12 +++++++++---
>  drivers/pci/quirks.c | 13 +++++++++++++
>  include/linux/pci.h  |  2 ++
>  3 files changed, 24 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8f7cfcc00090..e0b32eccfcf4 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4451,6 +4451,10 @@ static int pci_af_flr(struct pci_dev *dev, bool probe)
>   * cooldown period, which for the D0->D3hot and D3hot->D0 transitions is 10 ms
>   * by default (i.e. unless the @dev's d3hot_delay field has a different value).
>   * Moreover, only devices in D0 can be reset by this function.
> + *
> + * Some devices incorrectly advertise PCI_PM_CTRL_NO_SOFT_RESET but PM reset
> + * actually works. For such devices, PCI_DEV_FLAGS_FORCE_PM_RESET can be set
> + * via quirk to bypass the NO_SOFT_RESET check and enable PM reset.
>   */
>  static int pci_pm_reset(struct pci_dev *dev, bool probe)
>  {
> @@ -4460,9 +4464,11 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
>  	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
>  		return -ENOTTY;
>
> -	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
> -	if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
> -		return -ENOTTY;
> +	if (!(dev->dev_flags & PCI_DEV_FLAGS_FORCE_PM_RESET)) {
> +		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
> +		if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
> +			return -ENOTTY;
> +	}
>
>  	if (probe)
>  		return 0;
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index caaed1a01dc0..5e8b310c9d5f 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5595,6 +5595,19 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
>  }
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap);
>
> +/*
> + * Some devices incorrectly advertise NoSoftRst+ (suggesting PM reset won't
> + * work), but PM reset via D3hot->D0 transition actually works fine. Force
> + * PM reset for these devices to provide working reset capability.
> + */
> +static void quirk_force_pm_reset(struct pci_dev *dev)
> +{
> +	dev->dev_flags |= PCI_DEV_FLAGS_FORCE_PM_RESET;
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x1103, quirk_force_pm_reset); /* ath11k */
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x1107, quirk_force_pm_reset); /* ath12k */
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_QCOM, 0x0308, quirk_force_pm_reset); /* SDX62/SDX65 */
> +
>  /*
>   * FLR may cause the following to devices to hang:
>   *
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2c4454583c11..714dbdaa21af 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -261,6 +261,8 @@ enum pci_dev_flags {
>  	 * integrated with the downstream devices and doesn't use real PCI.
>  	 */
>  	PCI_DEV_FLAGS_PCI_BRIDGE_NO_ALIAS = (__force pci_dev_flags_t) (1 << 14),
> +	/* Force PM reset even when NoSoftRst+ is set */
> +	PCI_DEV_FLAGS_FORCE_PM_RESET = (__force pci_dev_flags_t) (1 << 15),
>  };
>
>  enum pci_irq_reroute_variant {

^ permalink raw reply [flat|nested] 5+ messages in thread

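The question about available reset methods in the reply above can be checked from user space. On kernels that provide it, each PCI device exposes a `reset_method` attribute under /sys/bus/pci/devices/<BDF>/ that names the supported and enabled reset methods and their ordering. The following is a minimal sketch of reading it, not something taken from the thread: the helper names read_first_line() and read_reset_method() are hypothetical, and the BDF passed in is whatever address the device under test has on the host.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a text file into buf and strip the trailing
 * newline. Returns 0 on success, -1 if the file cannot be opened.
 * (Hypothetical helper, for illustration only.)
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path != NULL && buf != NULL && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -1;

	/* An empty attribute is reported as an empty string. */
	if (!fgets(buf, (int)len, f))
		buf[0] = '\0';
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Read the reset_method attribute of the PCI device with the given
 * domain:bus:device.function address, e.g. "0000:03:00.0".
 * Returns -1 if the device or the attribute does not exist.
 */
static int read_reset_method(const char *bdf, char *buf, size_t len)
{
	char path[256];
	int n;

	n = snprintf(path, sizeof(path),
		     "/sys/bus/pci/devices/%s/reset_method", bdf);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	return read_first_line(path, buf, len);
}
```

Calling read_reset_method() with the device's BDF on an unpatched kernel and printing the buffer would show whether `bus` is listed for the Qualcomm devices, which is the information the reply asks for.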
* Re: [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+
  2026-05-08 14:51 ` [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+ Jose Ignacio Tornos Martinez
  2026-05-08 17:16   ` Alex Williamson
@ 2026-05-08 21:21   ` sashiko-bot
  1 sibling, 0 replies; 5+ messages in thread
From: sashiko-bot @ 2026-05-08 21:21 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez; +Cc: linux-pci

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] The local variable `csr` is left uninitialized and later written
  to the PCI Power Management Control register when the
  `PCI_DEV_FLAGS_FORCE_PM_RESET` flag is set.
--

commit 1f91cbd53469950e5e046d032ea4bd761ddec438
Author: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>

PCI: Force PM reset for Qualcomm devices with NoSoftRst+

This commit adds a new PCI_DEV_FLAGS_FORCE_PM_RESET flag to bypass the
PCI_PM_CTRL_NO_SOFT_RESET check during PM reset, applying it as a quirk
to several Qualcomm devices. This allows devices without FLR that falsely
advertise NoSoftRst+ to be properly reset, which is necessary for VFIO
passthrough re-assignment.

> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8f7cfcc000901..e0b32eccfcf42 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4460,9 +4464,11 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
>  	if (!dev->pm_cap || dev->dev_flags & PCI_DEV_FLAGS_NO_PM_RESET)
>  		return -ENOTTY;
>
> -	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
> -	if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
> -		return -ENOTTY;
> +	if (!(dev->dev_flags & PCI_DEV_FLAGS_FORCE_PM_RESET)) {
> +		pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &csr);
> +		if (csr & PCI_PM_CTRL_NO_SOFT_RESET)
> +			return -ENOTTY;
> +	}
>
>  	if (probe)
>  		return 0;

If the PCI_DEV_FLAGS_FORCE_PM_RESET flag is set, does this leave the local
variable csr uninitialized?

Looking at the rest of pci_pm_reset():

drivers/pci/pci.c:pci_pm_reset() {
    ...
	csr &= ~PCI_PM_CTRL_STATE_MASK;
	csr |= PCI_D3hot;
	pci_write_config_word(dev, dev->pm_cap + PCI_PM_CTRL, csr);
    ...
}

Writing uninitialized stack memory back to the PM Control/Status Register
could corrupt hardware state like PME Enable and Data Scale, or
inadvertently clear Write-1-to-Clear bits like PME Status. Could this cause
unpredictable device behavior or lockups?

--
Sashiko AI review · https://sashiko.dev/#/patchset/20260508145153.717641-2-jtornosm@redhat.com?part=1
^ permalink raw reply [flat|nested] 5+ messages in thread

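The uninitialized-csr concern raised in the review above can be illustrated with a small user-space model. One possible way to address it, shown here only as a sketch and not as the author's fix, is to read the register unconditionally and let the force flag affect only the NoSoftRst test, so that the later read-modify-write always starts from the real register contents. struct mock_dev, mock_pm_reset(), MOCK_D0 and MOCK_D3HOT are hypothetical stand-ins for the kernel's struct pci_dev, pci_pm_reset(), PCI_D0 and PCI_D3hot; the PCI_PM_CTRL_* values mirror include/uapi/linux/pci_regs.h, and the delays the kernel inserts between state changes are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register bits as defined in include/uapi/linux/pci_regs.h */
#define PCI_PM_CTRL_STATE_MASK		0x0003
#define PCI_PM_CTRL_NO_SOFT_RESET	0x0008
#define PCI_PM_CTRL_PME_ENABLE		0x0100

/* Power state encodings, standing in for PCI_D0 and PCI_D3hot */
#define MOCK_D0		0
#define MOCK_D3HOT	3

/* Hypothetical user-space stand-in for the relevant parts of struct pci_dev */
struct mock_dev {
	uint16_t pmcsr;		/* simulated PM Control/Status register */
	bool force_pm_reset;	/* models PCI_DEV_FLAGS_FORCE_PM_RESET */
};

static int mock_pm_reset(struct mock_dev *dev, bool probe)
{
	uint16_t csr;

	assert(dev != NULL);

	/* Always read the register so csr is valid for the writes below. */
	csr = dev->pmcsr;

	/* The force flag only overrides the NoSoftRst test. */
	if ((csr & PCI_PM_CTRL_NO_SOFT_RESET) && !dev->force_pm_reset)
		return -ENOTTY;

	if (probe)
		return 0;

	/* D0 -> D3hot, preserving all bits outside the state field */
	csr &= ~PCI_PM_CTRL_STATE_MASK;
	csr |= MOCK_D3HOT;
	dev->pmcsr = csr;

	/* D3hot -> D0 */
	csr &= ~PCI_PM_CTRL_STATE_MASK;
	csr |= MOCK_D0;
	dev->pmcsr = csr;

	return 0;
}
```

In this model, bits such as PME Enable survive the D0->D3hot->D0 round trip because the value written back is derived from what was read, which is the property the review says the posted hunk loses when the flag is set.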
* Re: [PATCH v2] PCI: Disable broken FLR on MediaTek MT7925
  2026-05-08 14:51 [PATCH v2] PCI: Disable broken FLR on MediaTek MT7925 Jose Ignacio Tornos Martinez
  2026-05-08 14:51 ` [PATCH v2] PCI: Force PM reset for Qualcomm devices with NoSoftRst+ Jose Ignacio Tornos Martinez
@ 2026-05-08 21:22 ` sashiko-bot
  1 sibling, 0 replies; 5+ messages in thread
From: sashiko-bot @ 2026-05-08 21:22 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez; +Cc: linux-pci

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Medium] Incomplete implementation: Missing MediaTek MT7925 device ID
  `0x0717` in the FLR quirk list.
--

commit 2cb7494208e4978bb4c09df85e374c7eac96b74d
Author: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>

PCI: Disable broken FLR on MediaTek MT7925

The MediaTek MT7925 WiFi device advertises FLR capability but its
implementation is broken, causing reset to fail and leaving the device in
an undefined state. This commit disables FLR for the device so the PCI
core can fall back to working reset methods like PM reset or bus reset.

> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index caaed1a01dc02..e49136ac5dbfb 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c

[ ... ]

> @@ -5619,6 +5620,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMD, 0x17f0, quirk_no_flr);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1502, quirk_no_flr);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1503, quirk_no_flr);
>  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x0616, quirk_no_flr);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_MEDIATEK, 0x7925, quirk_no_flr);

Should this quirk also include the 0x0717 device ID?

The MT7925 driver in drivers/net/wireless/mediatek/mt76/mt7925/pci.c
registers and supports two MediaTek device IDs: 0x7925 and 0x0717. Both IDs
share the same initialization logic, MAC behavior, and firmware.

Since 0x0717 belongs to the same MT7925 family and shares the same
underlying hardware IP, is it likely to have the same broken FLR
implementation?

If so, users with the 0x0717 variant might still experience the VFIO
passthrough undefined state issue if the device is omitted from this quirk
list.

--
Sashiko AI review · https://sashiko.dev/#/patchset/20260508145153.717641-1-jtornosm@redhat.com?part=1
^ permalink raw reply [flat|nested] 5+ messages in thread