* [PATCH v7 0/3] PCI: Add d3cold and device-specific reset for Qualcomm devices
@ 2026-06-03 10:58 Jose Ignacio Tornos Martinez
2026-06-03 10:58 ` [PATCH v7 1/3] PCI: Add d3cold as general reset method Jose Ignacio Tornos Martinez
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-06-03 10:58 UTC (permalink / raw)
To: bhelgaas, alex
Cc: jjohnson, mani, linux-pci, linux-wireless, ath11k, ath12k, mhi,
linux-kernel, Jose Ignacio Tornos Martinez
Some PCIe devices lack working reset methods for VFIO passthrough scenarios.
These devices typically have no FLR, advertise NoSoftRst+ (blocking PM reset),
and have broken or unavailable bus reset. When a VM crashes, VFIO cannot reset
the device for reuse without a working reset method.
This series addresses the problem by adding general d3cold infrastructure and
device-specific reset for Qualcomm devices:
**Patch 1/3: d3cold reset method**
Adds D3cold as a general reset method with strict _PR3 requirement. Only
attempts true power cycling via platform control (ACPI _PR3), never falls
back to D3hot. This provides genuine power-off reset on platforms that
support it.
**Patch 2/3: device-specific reset for Qualcomm devices**
Adds device-specific reset entries for Qualcomm WCN6855/WCN7850 WiFi cards and
SDX62/SDX65 modems using D3cold power cycling with automatic D3hot fallback.
Uses pci_set_power_state(D3cold) which automatically falls back to D3hot on
platforms without _PR3. Extracts shared pci_dev_d3cold_d0_cycle() helper to
avoid code duplication with general d3cold method. Device-specific reset is
position #1 in reset hierarchy, providing immediate working reset for these
Qualcomm devices.
**Patch 3/3: Qualcomm quirk_no_bus_reset**
Disables broken bus reset for Qualcomm devices. Testing proves this is
device-specific: MediaTek MT7925e works correctly with bus reset using the
same passive M.2-to-PCIe adapters where Qualcomm devices fail, confirming
PERST# is properly wired and the issue is not deployment-specific. Acts as
safety net to prevent broken SBR if users override via sysfs.
v7: Address the comments from Jeff Johnson: the problem was detected for
the specific devices WCN6855 and WCN7850, use the name of the devices
instead of the related drivers ath11k and ath12k, which support a large
number of devices (patches 2/3 and 3/3).
v6: https://lore.kernel.org/all/20260602160024.1171949-1-jtornosm@redhat.com/
Jose Ignacio Tornos Martinez (3):
PCI: Add d3cold as general reset method
PCI: Add device-specific reset for Qualcomm devices
PCI: Disable broken bus reset on Qualcomm devices
--
2.53.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v7 1/3] PCI: Add d3cold as general reset method
2026-06-03 10:58 [PATCH v7 0/3] PCI: Add d3cold and device-specific reset for Qualcomm devices Jose Ignacio Tornos Martinez
@ 2026-06-03 10:58 ` Jose Ignacio Tornos Martinez
2026-06-03 11:20 ` sashiko-bot
2026-06-10 18:42 ` Bjorn Helgaas
2026-06-03 10:58 ` [PATCH v6 2/3] PCI: Add device-specific reset for Qualcomm devices Jose Ignacio Tornos Martinez
2026-06-03 10:58 ` [PATCH v7 3/3] PCI: Disable broken bus reset on " Jose Ignacio Tornos Martinez
2 siblings, 2 replies; 9+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-06-03 10:58 UTC (permalink / raw)
To: bhelgaas, alex
Cc: jjohnson, mani, linux-pci, linux-wireless, ath11k, ath12k, mhi,
linux-kernel, Jose Ignacio Tornos Martinez
Add D3cold power cycle as a general PCI reset method for single-function
devices on platforms with ACPI _PR3 power resources. This provides true
power cycle reset capability when the platform can physically cut power
to the device.
The implementation strictly requires _PR3 to be present - the platform
must be able to control device power. This ensures d3cold only attempts
true power cycling, not falling back to D3hot transitions.
D3cold reset is placed at the end of the reset hierarchy since it requires
specific platform support and should be tried after standard methods.
Reset hierarchy with this change:
1. device_specific
2. acpi
3. flr
4. af_flr
5. pm (D3hot via config space, checks NoSoftRst)
6. bus (SBR)
7. cxl_bus
8. d3cold (NEW - true power cycle, requires _PR3)
This benefits:
- Platforms with _PR3 support
- Single-function devices needing true power cycle
- VFIO passthrough scenarios where FLR/PM unavailable
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v7: code unchanged from v6
v6: https://lore.kernel.org/all/20260602160024.1171949-2-jtornosm@redhat.com/
drivers/pci/pci.c | 50 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/pci.h | 2 +-
2 files changed, 51 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8f7cfcc00090..096868f80cd4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4491,6 +4491,55 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
return ret;
}
+/**
+ * pci_d3cold_reset - Put device into D3cold and back to D0 for reset
+ * @dev: PCI device to reset
+ * @probe: if true, check if D3cold reset is supported; if false, perform reset
+ *
+ * Reset the device by transitioning through D3cold (actual power removal via
+ * platform power control) and back to D0. This requires ACPI _PR3 power
+ * resources to be present - the platform must be able to physically cut power
+ * to the device.
+ *
+ * Only available for single-function devices to avoid affecting other
+ * functions in multi-function devices.
+ *
+ * Returns 0 if device can be/was reset this way, -ENOTTY if not supported,
+ * or other negative error code on failure.
+ */
+static int pci_d3cold_reset(struct pci_dev *dev, bool probe)
+{
+ int ret;
+
+ if (dev->multifunction)
+ return -ENOTTY;
+
+ if (!pci_pr3_present(dev))
+ return -ENOTTY;
+
+ if (probe)
+ return 0;
+
+ if (dev->current_state != PCI_D0)
+ return -EINVAL;
+
+ ret = pci_dev_reset_iommu_prepare(dev);
+ if (ret) {
+ pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", ret);
+ return ret;
+ }
+
+ ret = pci_set_power_state(dev, PCI_D3cold);
+ if (ret)
+ goto done;
+
+ ret = pci_set_power_state(dev, PCI_D0);
+
+done:
+ pci_dev_reset_iommu_done(dev);
+ return ret;
+}
+
/**
* pcie_wait_for_link_status - Wait for link status change
* @pdev: Device whose link to wait for.
@@ -5065,6 +5114,7 @@ const struct pci_reset_fn_method pci_reset_fn_methods[] = {
{ pci_pm_reset, .name = "pm" },
{ pci_reset_bus_function, .name = "bus" },
{ cxl_reset_bus_function, .name = "cxl_bus" },
+ { pci_d3cold_reset, .name = "d3cold" },
};
/**
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c4454583c11..1ca7b880ead7 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -51,7 +51,7 @@
PCI_STATUS_PARITY)
/* Number of reset methods used in pci_reset_fn_methods array in pci.c */
-#define PCI_NUM_RESET_METHODS 8
+#define PCI_NUM_RESET_METHODS 9
#define PCI_RESET_PROBE true
#define PCI_RESET_DO_RESET false
--
2.54.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v6 2/3] PCI: Add device-specific reset for Qualcomm devices
2026-06-03 10:58 [PATCH v7 0/3] PCI: Add d3cold and device-specific reset for Qualcomm devices Jose Ignacio Tornos Martinez
2026-06-03 10:58 ` [PATCH v7 1/3] PCI: Add d3cold as general reset method Jose Ignacio Tornos Martinez
@ 2026-06-03 10:58 ` Jose Ignacio Tornos Martinez
2026-06-03 11:39 ` sashiko-bot
2026-06-03 10:58 ` [PATCH v7 3/3] PCI: Disable broken bus reset on " Jose Ignacio Tornos Martinez
2 siblings, 1 reply; 9+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-06-03 10:58 UTC (permalink / raw)
To: bhelgaas, alex
Cc: jjohnson, mani, linux-pci, linux-wireless, ath11k, ath12k, mhi,
linux-kernel, Jose Ignacio Tornos Martinez
Some Qualcomm PCIe devices (WCN6855/WCN7850 WiFi cards, SDX62/SDX65 modems)
lack working reset methods for VFIO passthrough scenarios. These devices
have no FLR capability, advertise NoSoftRst+ (blocking PM reset), and have
broken bus reset.
The problem manifests in VFIO passthrough scenarios:
- WCN6855 WiFi card (17cb:1103): Normal VM operation works fine, including
clean shutdown/reboot. However, when the VM terminates uncleanly
(crash, force-off), VFIO attempts to reset the device before it can
be assigned to another VM. Without a working reset method, the device
remains in an undefined state, preventing reuse.
- WCN7850 WiFi card (17cb:1107): Same behavior as WCN6855.
- SDX62/SDX65 5G modems (17cb:0308): Never successfully initialize even
on first VM assignment without proper reset capability.
Add device-specific reset entries for these Qualcomm devices using D3cold
power cycling with automatic D3hot fallback. The implementation uses
pci_set_power_state(D3cold) which automatically falls back to D3hot on
platforms without ACPI _PR3 power resources. While not a complete reset
(BARs preserved), testing shows D3hot transition provides sufficient reset
for VFIO reuse.
Extract a shared pci_dev_d3cold_d0_cycle() helper function to avoid code
duplication between pci_d3cold_reset() (strict _PR3 requirement) and the
new reset_d3cold_d3hot() device-specific reset (automaticfallback). The
helper handles IOMMU preparation, performs the power cycle via
pci_set_power_state(), and cleans up IOMMU state.
Device-specific reset is position #1 in the reset hierarchy, so these
Qualcomm devices will use power cycling as their primary reset method,
with the general d3cold method (position #8) available as a fallback on
_PR3-capable platforms if users override via sysfs.
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v7: Address the comments from Jeff Johnson: the problem was detected for
the specific devices WCN6855 and WCN7850, use the name of the devices
instead of the related drivers ath11k and ath12k, which support a large
number of devices
v6: https://lore.kernel.org/all/20260602160024.1171949-3-jtornosm@redhat.com/
drivers/pci/pci.c | 37 +++++++++++++++++++++++++++----------
drivers/pci/pci.h | 1 +
drivers/pci/quirks.c | 19 +++++++++++++++++++
3 files changed, 47 insertions(+), 10 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 096868f80cd4..f7a7443287fd 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4491,6 +4491,32 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
return ret;
}
+/**
+ * pci_dev_d3cold_d0_cycle - Perform D3cold->D0 power cycle
+ * @dev: Device to power cycle
+ *
+ * Common helper to perform D3cold->D0 power cycle for reset methods.
+ * Attempts D3cold transition with automatic fallback to D3hot on platforms
+ * without ACPI _PR3 power resources.
+ *
+ * Caller must handle IOMMU preparation/cleanup if needed.
+ *
+ * Returns 0 on success, negative error code on failure.
+ */
+int pci_dev_d3cold_d0_cycle(struct pci_dev *dev)
+{
+ int ret;
+
+ if (dev->current_state != PCI_D0)
+ return -EINVAL;
+
+ ret = pci_set_power_state(dev, PCI_D3cold);
+ if (ret)
+ return ret;
+
+ return pci_set_power_state(dev, PCI_D0);
+}
+
/**
* pci_d3cold_reset - Put device into D3cold and back to D0 for reset
* @dev: PCI device to reset
@@ -4520,22 +4546,13 @@ static int pci_d3cold_reset(struct pci_dev *dev, bool probe)
if (probe)
return 0;
- if (dev->current_state != PCI_D0)
- return -EINVAL;
-
ret = pci_dev_reset_iommu_prepare(dev);
if (ret) {
pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", ret);
return ret;
}
- ret = pci_set_power_state(dev, PCI_D3cold);
- if (ret)
- goto done;
-
- ret = pci_set_power_state(dev, PCI_D0);
-
-done:
+ ret = pci_dev_d3cold_d0_cycle(dev);
pci_dev_reset_iommu_done(dev);
return ret;
}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4a14f88e543a..a9942787de9e 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -234,6 +234,7 @@ void pci_init_reset_methods(struct pci_dev *dev);
int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
int pci_bus_error_reset(struct pci_dev *dev);
int pci_try_reset_bridge(struct pci_dev *bridge);
+int pci_dev_d3cold_d0_cycle(struct pci_dev *dev);
struct pci_cap_saved_data {
u16 cap_nr;
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e49136ac5dbf..70f3b0f26799 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4237,6 +4237,22 @@ static int reset_hinic_vf_dev(struct pci_dev *pdev, bool probe)
return 0;
}
+/*
+ * Device-specific reset method via D3cold/D3hot power cycle.
+ *
+ * Some devices lack working FLR, advertise NoSoftRst+ (blocking PM reset),
+ * and have broken bus reset. This function provides device-specific reset via
+ * power cycling, attempting D3cold with automatic fallback to D3hot on platforms
+ * without ACPI _PR3 power resources.
+ */
+static int reset_d3cold_d3hot(struct pci_dev *dev, bool probe)
+{
+ if (probe)
+ return 0;
+
+ return pci_dev_d3cold_d0_cycle(dev);
+}
+
static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
reset_intel_82599_sfp_virtfn },
@@ -4252,6 +4268,9 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
reset_chelsio_generic_dev },
{ PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HINIC_VF,
reset_hinic_vf_dev },
+ { PCI_VENDOR_ID_QCOM, 0x1103, reset_d3cold_d3hot }, /* WCN6855 */
+ { PCI_VENDOR_ID_QCOM, 0x1107, reset_d3cold_d3hot }, /* WCN7850 */
+ { PCI_VENDOR_ID_QCOM, 0x0308, reset_d3cold_d3hot }, /* SDX62/SDX65 */
{ 0 }
};
--
2.54.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH v7 3/3] PCI: Disable broken bus reset on Qualcomm devices
2026-06-03 10:58 [PATCH v7 0/3] PCI: Add d3cold and device-specific reset for Qualcomm devices Jose Ignacio Tornos Martinez
2026-06-03 10:58 ` [PATCH v7 1/3] PCI: Add d3cold as general reset method Jose Ignacio Tornos Martinez
2026-06-03 10:58 ` [PATCH v6 2/3] PCI: Add device-specific reset for Qualcomm devices Jose Ignacio Tornos Martinez
@ 2026-06-03 10:58 ` Jose Ignacio Tornos Martinez
2 siblings, 0 replies; 9+ messages in thread
From: Jose Ignacio Tornos Martinez @ 2026-06-03 10:58 UTC (permalink / raw)
To: bhelgaas, alex
Cc: jjohnson, mani, linux-pci, linux-wireless, ath11k, ath12k, mhi,
linux-kernel, Jose Ignacio Tornos Martinez
Some Qualcomm PCIe devices (WCN6855/WCN7850 WiFi cards, SDX62/SDX65 modems)
do not properly support Secondary Bus Reset (SBR).
Testing confirms this is device-specific, not deployment-specific:
MediaTek MT7925e successfully uses bus reset through the same passive
M.2-to-PCIe adapters where Qualcomm devices fail, proving PERST# is
properly wired through the adapters.
This quirk acts as a safety net, preventing the broken bus reset from being
attempted if users override reset methods (device_specific or d3cold when
available) via sysfs.
Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
---
v7: Address the comments from Jeff Johnson: the problem was detected for
the specific devices WCN6855 and WCN7850, use the name of the devices
instead of the related drivers ath11k and ath12k, which support a large
number of devices
v6: https://lore.kernel.org/all/20260602160024.1171949-4-jtornosm@redhat.com/
drivers/pci/quirks.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 000000000000..111111111111 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3789,6 +3789,9 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0030, quirk_no_bus_reset);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0032, quirk_no_bus_reset);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003c, quirk_no_bus_reset);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0033, quirk_no_bus_reset);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x0034, quirk_no_bus_reset);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATHEROS, 0x003e, quirk_no_bus_reset);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_QCOM, 0x1103, quirk_no_bus_reset); /* WCN6855 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_QCOM, 0x1107, quirk_no_bus_reset); /* WCN7850 */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_QCOM, 0x0308, quirk_no_bus_reset); /* SDX62/SDX65 */
/*
* Root port on some Cavium CN8xxx chips do not successfully complete a bus
--
2.53.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v7 1/3] PCI: Add d3cold as general reset method
2026-06-03 10:58 ` [PATCH v7 1/3] PCI: Add d3cold as general reset method Jose Ignacio Tornos Martinez
@ 2026-06-03 11:20 ` sashiko-bot
2026-06-10 19:19 ` Bjorn Helgaas
2026-06-10 18:42 ` Bjorn Helgaas
1 sibling, 1 reply; 9+ messages in thread
From: sashiko-bot @ 2026-06-03 11:20 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] pci_d3cold_reset() fails to wait for device readiness after transitioning from D3cold to D0, leading to immediate config read failures and state corruption.
- [High] pci_d3cold_reset() does not prevent execution on single-function bridges, leading to permanent state corruption and physical data loss of all subordinate devices.
--
commit c7a8523521475c131fd83bc43a00b0cc028fc9a1
Author: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
PCI: Add d3cold as general reset method
This patch adds D3cold power cycling as a reset method for single-function
devices via ACPI _PR3, placing it at the end of the reset hierarchy.
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8f7cfcc000901..096868f80cd4c 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
[ ... ]
> +static int pci_d3cold_reset(struct pci_dev *dev, bool probe)
> +{
> + int ret;
> +
> + if (dev->multifunction)
> + return -ENOTTY;
> +
[Severity: High]
Does this allow execution on single-function bridges, such as PCIe Root
Ports?
If pci_set_power_state(dev, PCI_D3cold) is called on a bridge, it recursively
updates all subordinate devices' tracking state to D3cold via
__pci_bus_set_current_state() and physically removes power from them.
The subsequent transition back to D0 only updates the bridge itself. This
leaves all downstream devices tracked as PCI_D3cold indefinitely, and they
lose all their configuration and BARs since the reset framework only saves
and restores the bridge's configuration state.
> + if (!pci_pr3_present(dev))
> + return -ENOTTY;
> +
> + if (probe)
> + return 0;
> +
> + if (dev->current_state != PCI_D0)
> + return -EINVAL;
> +
> + ret = pci_dev_reset_iommu_prepare(dev);
> + if (ret) {
> + pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", ret);
> + return ret;
> + }
> +
> + ret = pci_set_power_state(dev, PCI_D3cold);
> + if (ret)
> + goto done;
> +
> + ret = pci_set_power_state(dev, PCI_D0);
[Severity: High]
Does this sequence fail to account for the mandatory wait time before the
device is responsive to config space accesses?
A D3cold to D0 transition acts as a fundamental reset. pci_set_power_state()
internally invokes pci_power_up(), which applies power and immediately
attempts to read PCI_PM_CTRL without waiting for link training or handling
Configuration Request Retry Status (CRS).
Because the device is unready, this read can return 0xFFFF (Unsupported
Request) and cause pci_power_up() to spuriously fail with -EIO. Should this
include a polling delay, similar to pci_dev_wait(), to prevent the reset
framework from blindly failing or restoring state to an uninitialized device?
> +
> +done:
> + pci_dev_reset_iommu_done(dev);
> + return ret;
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603105853.326290-1-jtornosm@redhat.com?part=1
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v6 2/3] PCI: Add device-specific reset for Qualcomm devices
2026-06-03 10:58 ` [PATCH v6 2/3] PCI: Add device-specific reset for Qualcomm devices Jose Ignacio Tornos Martinez
@ 2026-06-03 11:39 ` sashiko-bot
0 siblings, 0 replies; 9+ messages in thread
From: sashiko-bot @ 2026-06-03 11:39 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [High] Missing Device Readiness Wait
- [High] Unprotected State Modifications (IOMMU ATS invalidation on suspended device)
- [Medium] Commit Message Discrepancy (IOMMU preparation)
--
commit d37c19884265543746b79161f5a1ed58d8af78d3
Author: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
PCI: Add device-specific reset for Qualcomm devices
This commit adds a device-specific reset method for certain Qualcomm devices
(WCN6855, WCN7850, SDX62/SDX65) via D3cold/D3hot power cycling. It also
extracts a shared helper function for the power cycle transition.
> Extract a shared pci_dev_d3cold_d0_cycle() helper function to avoid code
> duplication between pci_d3cold_reset() (strict _PR3 requirement) and the
> new reset_d3cold_d3hot() device-specific reset (automaticfallback). The
> helper handles IOMMU preparation, performs the power cycle via
> pci_set_power_state(), and cleans up IOMMU state.
[Severity: Medium]
Is the description of the helper function accurate? The commit message states
that the helper handles IOMMU preparation and cleanup, but the extracted
pci_dev_d3cold_d0_cycle() function does not perform these operations and
explicitly requires the caller to handle them.
Could the commit message be updated to reflect the actual implementation?
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 096868f80cd4c..f7a7443287fdf 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
[ ... ]
> @@ -4491,6 +4491,32 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
> return ret;
> }
>
> +/**
> + * pci_dev_d3cold_d0_cycle - Perform D3cold->D0 power cycle
> + * @dev: Device to power cycle
> + *
> + * Common helper to perform D3cold->D0 power cycle for reset methods.
> + * Attempts D3cold transition with automatic fallback to D3hot on platforms
> + * without ACPI _PR3 power resources.
> + *
> + * Caller must handle IOMMU preparation/cleanup if needed.
> + *
> + * Returns 0 on success, negative error code on failure.
> + */
> +int pci_dev_d3cold_d0_cycle(struct pci_dev *dev)
> +{
> + int ret;
> +
> + if (dev->current_state != PCI_D0)
> + return -EINVAL;
> +
> + ret = pci_set_power_state(dev, PCI_D3cold);
> + if (ret)
> + return ret;
> +
> + return pci_set_power_state(dev, PCI_D0);
[Severity: High]
Does this sequence need to wait for the device to become ready? After
returning to D0 via pci_set_power_state(), the device undergoes a reset,
and the PCI spec requires polling Configuration Request Retry Status (CRS)
before it is ready.
Unlike pci_pm_reset(), which explicitly calls pci_dev_wait(), this helper
returns immediately. Since callers like pci_d3cold_reset() and
reset_d3cold_d3hot() rely on this, will this cause the PCI core to execute
pci_dev_restore() before the device is responsive, potentially leading to
dropped configuration writes?
> +}
> +
> /**
> * pci_d3cold_reset - Put device into D3cold and back to D0 for reset
> * @dev: PCI device to reset
> @@ -4520,22 +4546,13 @@ static int pci_d3cold_reset(struct pci_dev *dev, bool probe)
> if (probe)
> return 0;
>
> - if (dev->current_state != PCI_D0)
> - return -EINVAL;
> -
> ret = pci_dev_reset_iommu_prepare(dev);
[Severity: High]
Can this trigger ATS invalidations on suspended devices?
By moving the power state check inside the pci_dev_d3cold_d0_cycle()
helper, pci_dev_reset_iommu_prepare() is now executed before verifying that
the device is in D0. If the device is currently in D3, interacting with
the IOMMU to detach domains can trigger PCIe ATS invalidations, which may
result in Master Aborts or completion timeouts.
Should the state check remain before IOMMU preparation?
> if (ret) {
> pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", ret);
> return ret;
> }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260603105853.326290-1-jtornosm@redhat.com?part=2
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v7 1/3] PCI: Add d3cold as general reset method
2026-06-03 10:58 ` [PATCH v7 1/3] PCI: Add d3cold as general reset method Jose Ignacio Tornos Martinez
2026-06-03 11:20 ` sashiko-bot
@ 2026-06-10 18:42 ` Bjorn Helgaas
1 sibling, 0 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2026-06-10 18:42 UTC (permalink / raw)
To: Jose Ignacio Tornos Martinez
Cc: bhelgaas, alex, jjohnson, mani, linux-pci, linux-wireless, ath11k,
ath12k, mhi, linux-kernel, linux-pm
[+cc linux-pm]
On Wed, Jun 03, 2026 at 12:58:51PM +0200, Jose Ignacio Tornos Martinez wrote:
> Add D3cold power cycle as a general PCI reset method for single-function
> devices on platforms with ACPI _PR3 power resources. This provides true
> power cycle reset capability when the platform can physically cut power
> to the device.
>
> The implementation strictly requires _PR3 to be present - the platform
> must be able to control device power. This ensures d3cold only attempts
> true power cycling, not falling back to D3hot transitions.
>
> D3cold reset is placed at the end of the reset hierarchy since it requires
> specific platform support and should be tried after standard methods.
>
> Reset hierarchy with this change:
> 1. device_specific
> 2. acpi
> 3. flr
> 4. af_flr
> 5. pm (D3hot via config space, checks NoSoftRst)
> 6. bus (SBR)
> 7. cxl_bus
> 8. d3cold (NEW - true power cycle, requires _PR3)
>
> This benefits:
> - Platforms with _PR3 support
> - Single-function devices needing true power cycle
> - VFIO passthrough scenarios where FLR/PM unavailable
I assume you have some specific device(s) in mind here; can we mention
any examples so this is more than just theoretically useful?
> Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> ---
> v7: code unchanged from v6
> v6: https://lore.kernel.org/all/20260602160024.1171949-2-jtornosm@redhat.com/
>
> drivers/pci/pci.c | 50 +++++++++++++++++++++++++++++++++++++++++++++
> include/linux/pci.h | 2 +-
> 2 files changed, 51 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8f7cfcc00090..096868f80cd4 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4491,6 +4491,55 @@ static int pci_pm_reset(struct pci_dev *dev, bool probe)
> return ret;
> }
>
> +/**
> + * pci_d3cold_reset - Put device into D3cold and back to D0 for reset
> + * @dev: PCI device to reset
> + * @probe: if true, check if D3cold reset is supported; if false, perform reset
> + *
> + * Reset the device by transitioning through D3cold (actual power removal via
> + * platform power control) and back to D0. This requires ACPI _PR3 power
> + * resources to be present - the platform must be able to physically cut power
> + * to the device.
> + *
> + * Only available for single-function devices to avoid affecting other
> + * functions in multi-function devices.
> + *
> + * Returns 0 if device can be/was reset this way, -ENOTTY if not supported,
> + * or other negative error code on failure.
> + */
> +static int pci_d3cold_reset(struct pci_dev *dev, bool probe)
> +{
> + int ret;
> +
> + if (dev->multifunction)
> + return -ENOTTY;
> +
> + if (!pci_pr3_present(dev))
> + return -ENOTTY;
platform_pci_set_power_state() is currently only implemented for MID
and ACPI, but we're starting to see DT systems with power control for
devices, so I assume it may someday be extended to support them.
Checking pci_pr3_present() seems wrong to me because it assumes ACPI.
On ACPI systems, it seems like pci_set_power_state(PCI_D3cold) should
return an error if the necessary ACPI methods are not present, and we
wouldn't need this check. Is that not the case?
Or is this _PR3 check here just to support the "probe"? I don't know
if there's a generic interface to tell us whether we can control the
power to a device.
> + if (probe)
> + return 0;
> +
> + if (dev->current_state != PCI_D0)
> + return -EINVAL;
> +
> + ret = pci_dev_reset_iommu_prepare(dev);
> + if (ret) {
> + pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", ret);
> + return ret;
> + }
> +
> + ret = pci_set_power_state(dev, PCI_D3cold);
> + if (ret)
> + goto done;
> +
> + ret = pci_set_power_state(dev, PCI_D0);
> +
> +done:
> + pci_dev_reset_iommu_done(dev);
> + return ret;
> +}
> +
> /**
> * pcie_wait_for_link_status - Wait for link status change
> * @pdev: Device whose link to wait for.
> @@ -5065,6 +5114,7 @@ const struct pci_reset_fn_method pci_reset_fn_methods[] = {
> { pci_pm_reset, .name = "pm" },
> { pci_reset_bus_function, .name = "bus" },
> { cxl_reset_bus_function, .name = "cxl_bus" },
> + { pci_d3cold_reset, .name = "d3cold" },
> };
>
> /**
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2c4454583c11..1ca7b880ead7 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -51,7 +51,7 @@
> PCI_STATUS_PARITY)
>
> /* Number of reset methods used in pci_reset_fn_methods array in pci.c */
> -#define PCI_NUM_RESET_METHODS 8
> +#define PCI_NUM_RESET_METHODS 9
>
> #define PCI_RESET_PROBE true
> #define PCI_RESET_DO_RESET false
> --
> 2.54.0
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v7 1/3] PCI: Add d3cold as general reset method
2026-06-03 11:20 ` sashiko-bot
@ 2026-06-10 19:19 ` Bjorn Helgaas
2026-06-10 20:04 ` Rafael J. Wysocki
0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2026-06-10 19:19 UTC (permalink / raw)
To: sashiko-reviews
Cc: Jose Ignacio Tornos Martinez, linux-pci, Rafael J. Wysocki
[+cc Rafael, D3cold->D0 delays]
On Wed, Jun 03, 2026 at 11:20:18AM +0000, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> - [High] pci_d3cold_reset() fails to wait for device readiness after transitioning from D3cold to D0, leading to immediate config read failures and state corruption.
> - [High] pci_d3cold_reset() does not prevent execution on single-function bridges, leading to permanent state corruption and physical data loss of all subordinate devices.
> --
>
> commit c7a8523521475c131fd83bc43a00b0cc028fc9a1
> Author: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
>
> PCI: Add d3cold as general reset method
>
> This patch adds D3cold power cycling as a reset method for single-function
> devices via ACPI _PR3, placing it at the end of the reset hierarchy.
>
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 8f7cfcc000901..096868f80cd4c 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
>
> [ ... ]
>
> > +static int pci_d3cold_reset(struct pci_dev *dev, bool probe)
> > +{
> > + int ret;
> > +
> > + if (dev->multifunction)
> > + return -ENOTTY;
> > +
>
> [Severity: High]
> Does this allow execution on single-function bridges, such as PCIe Root
> Ports?
>
> If pci_set_power_state(dev, PCI_D3cold) is called on a bridge, it recursively
> updates all subordinate devices' tracking state to D3cold via
> __pci_bus_set_current_state() and physically removes power from them.
>
> The subsequent transition back to D0 only updates the bridge itself. This
> leaves all downstream devices tracked as PCI_D3cold indefinitely, and they
> lose all their configuration and BARs since the reset framework only saves
> and restores the bridge's configuration state.
Secondary Bus Reset has a similar issue: the hot reset on the
secondary side is propagated to downstream devices.
pci_parent_bus_reset() returns -ENOTTY if we try to use SBR to reset a
bridge. I wonder if that check should be moved up to
pci_reset_bus_function() we don't try to use pci_reset_hotplug_slot()
on bridges either.
So I think it *is* a problem if we put a bridge in D3cold because
that's also a reset of any downstream devices.
> > + if (!pci_pr3_present(dev))
> > + return -ENOTTY;
> > +
> > + if (probe)
> > + return 0;
> > +
> > + if (dev->current_state != PCI_D0)
> > + return -EINVAL;
> > +
> > + ret = pci_dev_reset_iommu_prepare(dev);
> > + if (ret) {
> > + pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", ret);
> > + return ret;
> > + }
> > +
> > + ret = pci_set_power_state(dev, PCI_D3cold);
> > + if (ret)
> > + goto done;
> > +
> > + ret = pci_set_power_state(dev, PCI_D0);
>
> [Severity: High]
> Does this sequence fail to account for the mandatory wait time before the
> device is responsive to config space accesses?
>
> A D3cold to D0 transition acts as a fundamental reset. pci_set_power_state()
> internally invokes pci_power_up(), which applies power and immediately
> attempts to read PCI_PM_CTRL without waiting for link training or handling
> Configuration Request Retry Status (CRS).
I don't think sashiko can infer this from the code, but my
understanding is that platform_pci_set_power_state(PCI_D0) is
responsible for delays after a D3cold -> D0 transition, so I don't
think this one is actually an issue:
https://lore.kernel.org/all/CAJZ5v0iZN5NtUztqe=MxCRcXdBaaqzZ749OqSUkadwwBy0ugUQ@mail.gmail.com/
> Because the device is unready, this read can return 0xFFFF (Unsupported
> Request) and cause pci_power_up() to spuriously fail with -EIO. Should this
> include a polling delay, similar to pci_dev_wait(), to prevent the reset
> framework from blindly failing or restoring state to an uninitialized device?
>
> > +
> > +done:
> > + pci_dev_reset_iommu_done(dev);
> > + return ret;
> > +}
>
> --
> Sashiko AI review · https://sashiko.dev/#/patchset/20260603105853.326290-1-jtornosm@redhat.com?part=1
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v7 1/3] PCI: Add d3cold as general reset method
2026-06-10 19:19 ` Bjorn Helgaas
@ 2026-06-10 20:04 ` Rafael J. Wysocki
0 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2026-06-10 20:04 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: sashiko-reviews, Jose Ignacio Tornos Martinez, linux-pci,
Rafael J. Wysocki
On Wed, Jun 10, 2026 at 9:19 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Rafael, D3cold->D0 delays]
>
> On Wed, Jun 03, 2026 at 11:20:18AM +0000, sashiko-bot@kernel.org wrote:
> > Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> > - [High] pci_d3cold_reset() fails to wait for device readiness after transitioning from D3cold to D0, leading to immediate config read failures and state corruption.
> > - [High] pci_d3cold_reset() does not prevent execution on single-function bridges, leading to permanent state corruption and physical data loss of all subordinate devices.
> > --
> >
> > commit c7a8523521475c131fd83bc43a00b0cc028fc9a1
> > Author: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
> >
> > PCI: Add d3cold as general reset method
> >
> > This patch adds D3cold power cycling as a reset method for single-function
> > devices via ACPI _PR3, placing it at the end of the reset hierarchy.
> >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 8f7cfcc000901..096868f80cd4c 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> >
> > [ ... ]
> >
> > > +static int pci_d3cold_reset(struct pci_dev *dev, bool probe)
> > > +{
> > > + int ret;
> > > +
> > > + if (dev->multifunction)
> > > + return -ENOTTY;
> > > +
> >
> > [Severity: High]
> > Does this allow execution on single-function bridges, such as PCIe Root
> > Ports?
> >
> > If pci_set_power_state(dev, PCI_D3cold) is called on a bridge, it recursively
> > updates all subordinate devices' tracking state to D3cold via
> > __pci_bus_set_current_state() and physically removes power from them.
> >
> > The subsequent transition back to D0 only updates the bridge itself. This
> > leaves all downstream devices tracked as PCI_D3cold indefinitely, and they
> > lose all their configuration and BARs since the reset framework only saves
> > and restores the bridge's configuration state.
>
> Secondary Bus Reset has a similar issue: the hot reset on the
> secondary side is propagated to downstream devices.
> pci_parent_bus_reset() returns -ENOTTY if we try to use SBR to reset a
> bridge. I wonder if that check should be moved up to
> pci_reset_bus_function() we don't try to use pci_reset_hotplug_slot()
> on bridges either.
>
> So I think it *is* a problem if we put a bridge in D3cold because
> that's also a reset of any downstream devices.
It would also be quite confusing for the drivers of the downstream
devices, especially if they support runtime PM and use the reference
counting thereof to ensure no power loss.
It looks like such a reset would require unbinding all of the drivers
from the devices below the bridge.
> > > + if (!pci_pr3_present(dev))
> > > + return -ENOTTY;
> > > +
> > > + if (probe)
> > > + return 0;
> > > +
> > > + if (dev->current_state != PCI_D0)
> > > + return -EINVAL;
> > > +
> > > + ret = pci_dev_reset_iommu_prepare(dev);
> > > + if (ret) {
> > > + pci_err(dev, "failed to stop IOMMU for a PCI reset: %d\n", ret);
> > > + return ret;
> > > + }
> > > +
> > > + ret = pci_set_power_state(dev, PCI_D3cold);
> > > + if (ret)
> > > + goto done;
> > > +
> > > + ret = pci_set_power_state(dev, PCI_D0);
> >
> > [Severity: High]
> > Does this sequence fail to account for the mandatory wait time before the
> > device is responsive to config space accesses?
> >
> > A D3cold to D0 transition acts as a fundamental reset. pci_set_power_state()
> > internally invokes pci_power_up(), which applies power and immediately
> > attempts to read PCI_PM_CTRL without waiting for link training or handling
> > Configuration Request Retry Status (CRS).
>
> I don't think sashiko can infer this from the code, but my
> understanding is that platform_pci_set_power_state(PCI_D0) is
> responsible for delays after a D3cold -> D0 transition, so I don't
> think this one is actually an issue:
>
> https://lore.kernel.org/all/CAJZ5v0iZN5NtUztqe=MxCRcXdBaaqzZ749OqSUkadwwBy0ugUQ@mail.gmail.com/
Right.
> > Because the device is unready, this read can return 0xFFFF (Unsupported
> > Request) and cause pci_power_up() to spuriously fail with -EIO. Should this
> > include a polling delay, similar to pci_dev_wait(), to prevent the reset
> > framework from blindly failing or restoring state to an uninitialized device?
> >
> > > +
> > > +done:
> > > + pci_dev_reset_iommu_done(dev);
> > > + return ret;
> > > +}
> >
> > --
> > Sashiko AI review · https://sashiko.dev/#/patchset/20260603105853.326290-1-jtornosm@redhat.com?part=1
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-06-10 20:04 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-03 10:58 [PATCH v7 0/3] PCI: Add d3cold and device-specific reset for Qualcomm devices Jose Ignacio Tornos Martinez
2026-06-03 10:58 ` [PATCH v7 1/3] PCI: Add d3cold as general reset method Jose Ignacio Tornos Martinez
2026-06-03 11:20 ` sashiko-bot
2026-06-10 19:19 ` Bjorn Helgaas
2026-06-10 20:04 ` Rafael J. Wysocki
2026-06-10 18:42 ` Bjorn Helgaas
2026-06-03 10:58 ` [PATCH v6 2/3] PCI: Add device-specific reset for Qualcomm devices Jose Ignacio Tornos Martinez
2026-06-03 11:39 ` sashiko-bot
2026-06-03 10:58 ` [PATCH v7 3/3] PCI: Disable broken bus reset on " Jose Ignacio Tornos Martinez
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.