* [PATCH v2 1/4] PCI: Introduce an API to check if RC/platform can retain device context during suspend
2026-05-19 8:11 [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API Manivannan Sadhasivam via B4 Relay
@ 2026-05-19 8:11 ` Manivannan Sadhasivam via B4 Relay
2026-05-19 8:11 ` [PATCH v2 2/4] PCI: Indicate context lost if L1ss exit is broken during resume from system suspend Manivannan Sadhasivam via B4 Relay
` (4 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2026-05-19 8:11 UTC (permalink / raw)
To: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg
Cc: linux-pci, linux-kernel, linux-arm-msm, linux-nvme,
Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Currently, PCI endpoint drivers like NVMe check whether the device
context will be retained during system suspend with the help of the
pm_suspend_via_firmware() API.
But it is possible that the device context might be lost due to some
platform limitation as well. Having those checks in the endpoint drivers
will not scale and will cause a lot of code duplication.
So introduce an API that acts as a single source of truth that the endpoint
drivers can rely on to check whether they can expect the device context
to be retained or not.
If the API returns 'false', then the client drivers need to prepare for
context loss by performing actions such as resetting the device, saving
the context, shutting it down, etc. If it returns 'true', then the drivers
do not need to perform any special action and can leave the device in
active state.
Right now, this API only incorporates the pm_suspend_via_firmware() check,
but it will be extended in future commits.
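The contract described above could be consumed by a client driver roughly as
follows. This is a userspace sketch, not kernel code: 'struct pci_dev' is
reduced to a stub, pci_suspend_retains_context() is a stand-in whose result is
driven by a hypothetical 'fw_suspend' field, and demo_suspend() together with
the 'enum demo_suspend_action' values are names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stub standing in for the kernel's struct pci_dev (assumption). */
struct pci_dev {
	bool fw_suspend;	/* models the pm_suspend_via_firmware() result */
};

/* Stand-in for the API added by this patch: same name, simplified body. */
static bool pci_suspend_retains_context(struct pci_dev *pdev)
{
	assert(pdev != NULL);
	return !pdev->fw_suspend;
}

/* Hypothetical actions a client driver may take at suspend time. */
enum demo_suspend_action {
	DEMO_KEEP_ACTIVE,	/* context retained: leave the device as is */
	DEMO_PREPARE_FOR_LOSS,	/* context may be lost: save, reset or shut down */
};

/* Hypothetical client driver suspend decision built on the API. */
static enum demo_suspend_action demo_suspend(struct pci_dev *pdev)
{
	if (!pci_suspend_retains_context(pdev))
		return DEMO_PREPARE_FOR_LOSS;

	return DEMO_KEEP_ACTIVE;
}
```

The point of the sketch is that the driver only branches on the API result
and does not need to know why the context may be lost.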
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/pci.c | 23 +++++++++++++++++++++++
include/linux/pci.h | 7 +++++++
2 files changed, 30 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8f7cfcc00090..38cc5172d259 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -33,6 +33,7 @@
#include <asm/dma.h>
#include <linux/aer.h>
#include <linux/bitfield.h>
+#include <linux/suspend.h>
#include "pci.h"
DEFINE_MUTEX(pci_slot_mutex);
@@ -2899,6 +2900,28 @@ void pci_config_pm_runtime_put(struct pci_dev *pdev)
pm_runtime_put_sync(parent);
}
+/**
+ * pci_suspend_retains_context - Check if the platform can retain the device
+ * context during system suspend
+ * @pdev: PCI device to check
+ *
+ * Returns true if the platform can guarantee to retain the device context,
+ * false otherwise.
+ */
+bool pci_suspend_retains_context(struct pci_dev *pdev)
+{
+ /*
+ * If the platform firmware (like ACPI) is involved at the end of system
+ * suspend, device context may not be retained.
+ */
+ if (pm_suspend_via_firmware())
+ return false;
+
+ /* Assume that the context is retained by default */
+ return true;
+}
+EXPORT_SYMBOL_GPL(pci_suspend_retains_context);
+
static const struct dmi_system_id bridge_d3_blacklist[] = {
#ifdef CONFIG_X86
{
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c4454583c11..f60f9e4e7b39 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2086,6 +2086,8 @@ pci_release_mem_regions(struct pci_dev *pdev)
pci_select_bars(pdev, IORESOURCE_MEM));
}
+bool pci_suspend_retains_context(struct pci_dev *pdev);
+
#else /* CONFIG_PCI is not enabled */
static inline void pci_set_flags(int flags) { }
@@ -2244,6 +2246,11 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
static inline void pci_free_irq_vectors(struct pci_dev *dev)
{
}
+
+static inline bool pci_suspend_retains_context(struct pci_dev *pdev)
+{
+ return true;
+}
#endif /* CONFIG_PCI */
/* Include architecture-dependent settings and functions */
--
2.48.1
* [PATCH v2 2/4] PCI: Indicate context lost if L1ss exit is broken during resume from system suspend
2026-05-19 8:11 [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API Manivannan Sadhasivam via B4 Relay
2026-05-19 8:11 ` [PATCH v2 1/4] PCI: Introduce an API to check if RC/platform can retain device context during suspend Manivannan Sadhasivam via B4 Relay
@ 2026-05-19 8:11 ` Manivannan Sadhasivam via B4 Relay
2026-05-22 23:21 ` Bjorn Helgaas
2026-05-19 8:11 ` [PATCH v2 3/4] PCI: qcom: Indicate broken L1ss exit " Manivannan Sadhasivam via B4 Relay
` (3 subsequent siblings)
5 siblings, 1 reply; 11+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2026-05-19 8:11 UTC (permalink / raw)
To: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg
Cc: linux-pci, linux-kernel, linux-arm-msm, linux-nvme,
Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
The PCIe spec v7.0, sec 5.5.3.3.1, states that for exiting L1.2 due to an
endpoint asserting CLKREQ# signal, the refclk must be turned on no earlier
than TL10_REFCLK_ON, and within the latency advertised in the LTR message.
This same behavior applies to L1.1 as well.
On some platforms like Qcom, these requirements are satisfied during OS
runtime, but not while resuming from the system suspend. This happens
because the PCIe RC driver may remove all resource votes and turn off the
analog circuitry of PHY during suspend to maximize power savings while
keeping the link in L1ss.
Consequently, when the endpoint asserts CLKREQ# to wake up, the OS must
first resume and the RC driver must restore the PHY and enable the REFCLK.
When this recovery process exceeds the L1ss exit latency time (roughly
L10_REFCLK_ON + T_COMMONMODE), the endpoint may treat it as a fatal
condition and trigger Link Down (LDn). If the endpoint device is used to
host the RootFS, it will result in an OS crash. For other endpoints, it
may result in a complete device reset/recovery.
So to indicate this platform limitation to the client drivers, introduce a
new flag 'pci_host_bridge::broken_l1ss_resume' and check it in the
pci_suspend_retains_context() API. If the flag is set by the RC driver, the
API will return 'false', indicating to the client drivers that the device
context may not be retained and the drivers must be prepared for context
loss.
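The resulting order of checks can be modelled in userspace as below. This is
only a sketch under assumptions: 'struct model_host_bridge' and
'struct model_pci_dev' are reduced stand-ins for the kernel structures, the
firmware check is passed in as a plain boolean instead of calling
pm_suspend_via_firmware(), and the bridge pointer replaces the
pci_find_host_bridge() lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced model of struct pci_host_bridge: only the new flag is kept. */
struct model_host_bridge {
	unsigned int broken_l1ss_resume:1;
};

/* Reduced model of struct pci_dev: points straight at its host bridge. */
struct model_pci_dev {
	const struct model_host_bridge *bridge;	/* may be NULL */
};

/* Userspace model of the check order in pci_suspend_retains_context(). */
static bool model_suspend_retains_context(bool suspend_via_firmware,
					  const struct model_pci_dev *pdev)
{
	/* The real function dereferences pdev, so NULL is not accepted. */
	assert(pdev != NULL);

	/* Firmware involved at the end of suspend: context may be lost. */
	if (suspend_via_firmware)
		return false;

	/* Host bridge cannot exit L1ss in time while resuming. */
	if (pdev->bridge && pdev->bridge->broken_l1ss_resume)
		return false;

	/* Assume that the context is retained by default. */
	return true;
}
```

Either condition alone is enough to report possible context loss, and a
missing bridge falls through to the default of 'true'.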
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/pci.c | 11 +++++++++++
include/linux/pci.h | 2 ++
2 files changed, 13 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 38cc5172d259..a7d2cb69b42e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2910,6 +2910,8 @@ void pci_config_pm_runtime_put(struct pci_dev *pdev)
*/
bool pci_suspend_retains_context(struct pci_dev *pdev)
{
+ struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
+
/*
* If the platform firmware (like ACPI) is involved at the end of system
* suspend, device context may not be retained.
@@ -2917,6 +2919,15 @@ bool pci_suspend_retains_context(struct pci_dev *pdev)
if (pm_suspend_via_firmware())
return false;
+ /*
+ * Some host bridges power off the PHY to enter deep low-power modes
+ * during system suspend. Exiting L1 PM Substates from this condition
+ * violates strict timing requirements and results in Link Down (LDn).
+ * On such platforms, the endpoint must be prepared for context loss.
+ */
+ if (bridge && bridge->broken_l1ss_resume)
+ return false;
+
/* Assume that the context is retained by default */
return true;
}
diff --git a/include/linux/pci.h b/include/linux/pci.h
index f60f9e4e7b39..1e5b59fa258a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -660,6 +660,8 @@ struct pci_host_bridge {
unsigned int preserve_config:1; /* Preserve FW resource setup */
unsigned int size_windows:1; /* Enable root bus sizing */
unsigned int msi_domain:1; /* Bridge wants MSI domain */
+ unsigned int broken_l1ss_resume:1; /* Resuming from L1ss during
+ system suspend is broken */
/* Resource alignment requirements */
resource_size_t (*align_resource)(struct pci_dev *dev,
--
2.48.1
* Re: [PATCH v2 2/4] PCI: Indicate context lost if L1ss exit is broken during resume from system suspend
2026-05-19 8:11 ` [PATCH v2 2/4] PCI: Indicate context lost if L1ss exit is broken during resume from system suspend Manivannan Sadhasivam via B4 Relay
@ 2026-05-22 23:21 ` Bjorn Helgaas
2026-05-23 9:14 ` Manivannan Sadhasivam
0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2026-05-22 23:21 UTC (permalink / raw)
To: manivannan.sadhasivam
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-pci, linux-kernel,
linux-arm-msm, linux-nvme
On Tue, May 19, 2026 at 01:41:21PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
>
> The PCIe spec v7.0, sec 5.5.3.3.1, states that for exiting L1.2 due to an
> endpoint asserting CLKREQ# signal, the refclk must be turned on no earlier
> than TL10_REFCLK_ON, and within the latency advertised in the LTR message.
> This same behavior applies to L1.1 as well.
It sounds like only the "within the latency advertised in the LTR
message" part is relevant in this case, and there's no issue with the
"no earlier than TL10_REFCLK_ON" part?
> On some platforms like Qcom, these requirements are satisfied during OS
> runtime, but not while resuming from the system suspend. This happens
> because the PCIe RC driver may remove all resource votes and turns off the
> analog circuitry of PHY during suspend to maximize power savings while
> keeping the link in L1ss.
>
> Consequently, when the endpoint asserts CLKREQ# to wake up, the OS must
> first resume and the RC driver must restore the PHY and enable the REFCLK.
> When this recovery process exceeds the L1ss exit latency time (roughly
> L10_REFCLK_ON + T_COMMONMODE), the endpoint may treat it as a fatal
> condition and triger Link Down (LDn). If the endpoint device is used to
> host the RootFS, it will result in an OS crash. For other endpoints, it
> may result in a complete device reset/recovery.
s/triger/trigger/
> So to indicate this platform limitation to the client drivers, introduce a
> new flag 'pci_host_bridge::broken_l1ss_resume' and check it in the
> pci_suspend_retains_context() API. If the flag is set by the RC driver, the
> API will return 'false' indicating the client drivers that the device
> context may not be retained and the drivers must be prepared for context
> loss.
Thanks for the details, this makes sense to me now.
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> ---
> drivers/pci/pci.c | 11 +++++++++++
> include/linux/pci.h | 2 ++
> 2 files changed, 13 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 38cc5172d259..a7d2cb69b42e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2910,6 +2910,8 @@ void pci_config_pm_runtime_put(struct pci_dev *pdev)
> */
> bool pci_suspend_retains_context(struct pci_dev *pdev)
> {
> + struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
> +
> /*
> * If the platform firmware (like ACPI) is involved at the end of system
> * suspend, device context may not be retained.
> @@ -2917,6 +2919,15 @@ bool pci_suspend_retains_context(struct pci_dev *pdev)
> if (pm_suspend_via_firmware())
> return false;
>
> + /*
> + * Some host bridges power off the PHY to enter deep low-power modes
> + * during system suspend. Exiting L1 PM Substates from this condition
> + * violates strict timing requirements and results in Link Down (LDn).
> + * On such platforms, the endpoint must be prepared for context loss.
> + */
> + if (bridge && bridge->broken_l1ss_resume)
> + return false;
> +
> /* Assume that the context is retained by default */
> return true;
> }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index f60f9e4e7b39..1e5b59fa258a 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -660,6 +660,8 @@ struct pci_host_bridge {
> unsigned int preserve_config:1; /* Preserve FW resource setup */
> unsigned int size_windows:1; /* Enable root bus sizing */
> unsigned int msi_domain:1; /* Bridge wants MSI domain */
> + unsigned int broken_l1ss_resume:1; /* Resuming from L1ss during
> + system suspend is broken */
>
> /* Resource alignment requirements */
> resource_size_t (*align_resource)(struct pci_dev *dev,
>
> --
> 2.48.1
>
>
* Re: [PATCH v2 2/4] PCI: Indicate context lost if L1ss exit is broken during resume from system suspend
2026-05-22 23:21 ` Bjorn Helgaas
@ 2026-05-23 9:14 ` Manivannan Sadhasivam
0 siblings, 0 replies; 11+ messages in thread
From: Manivannan Sadhasivam @ 2026-05-23 9:14 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: manivannan.sadhasivam, Bjorn Helgaas, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-pci, linux-kernel,
linux-arm-msm, linux-nvme
On Fri, May 22, 2026 at 06:21:10PM -0500, Bjorn Helgaas wrote:
> On Tue, May 19, 2026 at 01:41:21PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> >
> > The PCIe spec v7.0, sec 5.5.3.3.1, states that for exiting L1.2 due to an
> > endpoint asserting CLKREQ# signal, the refclk must be turned on no earlier
> > than TL10_REFCLK_ON, and within the latency advertised in the LTR message.
> > This same behavior applies to L1.1 as well.
>
> It sounds like only the "within the latency advertised in the LTR
> message" part is relevant in this case, and there's no issue with the
> "no earlier than TL10_REFCLK_ON" part?
>
Yes, that's true. I took the excerpt from the spec here, but there is no issue
in enabling REFCLK no earlier than TL10_REFCLK_ON.
> > On some platforms like Qcom, these requirements are satisfied during OS
> > runtime, but not while resuming from the system suspend. This happens
> > because the PCIe RC driver may remove all resource votes and turns off the
> > analog circuitry of PHY during suspend to maximize power savings while
> > keeping the link in L1ss.
> >
> > Consequently, when the endpoint asserts CLKREQ# to wake up, the OS must
> > first resume and the RC driver must restore the PHY and enable the REFCLK.
> > When this recovery process exceeds the L1ss exit latency time (roughly
> > L10_REFCLK_ON + T_COMMONMODE), the endpoint may treat it as a fatal
> > condition and triger Link Down (LDn). If the endpoint device is used to
> > host the RootFS, it will result in an OS crash. For other endpoints, it
> > may result in a complete device reset/recovery.
>
> s/triger/trigger/
>
> > So to indicate this platform limitation to the client drivers, introduce a
> > new flag 'pci_host_bridge::broken_l1ss_resume' and check it in the
> > pci_suspend_retains_context() API. If the flag is set by the RC driver, the
> > API will return 'false' indicating the client drivers that the device
> > context may not be retained and the drivers must be prepared for context
> > loss.
>
> Thanks for the details, this makes sense to me now.
>
Since we got an ack from the NVMe maintainer, will you be queuing the series for
v7.2? I'd like this series to soak in linux-next for some time, though the
impact is very minimal.
- Mani
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> > ---
> > drivers/pci/pci.c | 11 +++++++++++
> > include/linux/pci.h | 2 ++
> > 2 files changed, 13 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 38cc5172d259..a7d2cb69b42e 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -2910,6 +2910,8 @@ void pci_config_pm_runtime_put(struct pci_dev *pdev)
> > */
> > bool pci_suspend_retains_context(struct pci_dev *pdev)
> > {
> > + struct pci_host_bridge *bridge = pci_find_host_bridge(pdev->bus);
> > +
> > /*
> > * If the platform firmware (like ACPI) is involved at the end of system
> > * suspend, device context may not be retained.
> > @@ -2917,6 +2919,15 @@ bool pci_suspend_retains_context(struct pci_dev *pdev)
> > if (pm_suspend_via_firmware())
> > return false;
> >
> > + /*
> > + * Some host bridges power off the PHY to enter deep low-power modes
> > + * during system suspend. Exiting L1 PM Substates from this condition
> > + * violates strict timing requirements and results in Link Down (LDn).
> > + * On such platforms, the endpoint must be prepared for context loss.
> > + */
> > + if (bridge && bridge->broken_l1ss_resume)
> > + return false;
> > +
> > /* Assume that the context is retained by default */
> > return true;
> > }
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index f60f9e4e7b39..1e5b59fa258a 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -660,6 +660,8 @@ struct pci_host_bridge {
> > unsigned int preserve_config:1; /* Preserve FW resource setup */
> > unsigned int size_windows:1; /* Enable root bus sizing */
> > unsigned int msi_domain:1; /* Bridge wants MSI domain */
> > + unsigned int broken_l1ss_resume:1; /* Resuming from L1ss during
> > + system suspend is broken */
> >
> > /* Resource alignment requirements */
> > resource_size_t (*align_resource)(struct pci_dev *dev,
> >
> > --
> > 2.48.1
> >
> >
--
மணிவண்ணன் சதாசிவம்
* [PATCH v2 3/4] PCI: qcom: Indicate broken L1ss exit during resume from system suspend
2026-05-19 8:11 [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API Manivannan Sadhasivam via B4 Relay
2026-05-19 8:11 ` [PATCH v2 1/4] PCI: Introduce an API to check if RC/platform can retain device context during suspend Manivannan Sadhasivam via B4 Relay
2026-05-19 8:11 ` [PATCH v2 2/4] PCI: Indicate context lost if L1ss exit is broken during resume from system suspend Manivannan Sadhasivam via B4 Relay
@ 2026-05-19 8:11 ` Manivannan Sadhasivam via B4 Relay
2026-05-19 8:11 ` [PATCH v2 4/4] nvme-pci: Use pci_suspend_retains_context() API during suspend Manivannan Sadhasivam via B4 Relay
` (2 subsequent siblings)
5 siblings, 0 replies; 11+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2026-05-19 8:11 UTC (permalink / raw)
To: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg
Cc: linux-pci, linux-kernel, linux-arm-msm, linux-nvme,
Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Qcom PCIe RCs can successfully exit from L1ss during OS runtime. However,
during system suspend, the Qcom PCIe RC driver may remove all resource
votes and turn off the PHY to maximize power savings.
Consequently, when the host is in system suspend with the link in L1ss and
the endpoint asserts CLKREQ#, the OS must first wake up and the RC driver
must restore the PHY and enable the refclk. This recovery process causes
the strict L1ss exit latency time to be exceeded (roughly L10_REFCLK_ON +
T_COMMONMODE). If the RC driver were to retain all votes during suspend,
L1ss exit would succeed without issue, but at the expense of higher power
consumption.
So when the host fails to move the link from L1ss to L0 within the
L10_REFCLK_ON + T_COMMONMODE time, the endpoint may treat it as a fatal
condition and trigger Link Down (LDn) during resume. This LDn can crash the
OS if the endpoint hosts the RootFS, and for other types of devices, it may
result in a full device reset/recovery.
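The timing argument above amounts to comparing the host's recovery time with
the L1ss exit budget. A minimal sketch of that comparison follows. The
structure, the function names and the microsecond unit are assumptions made
for illustration, and any values fed into it are arbitrary inputs rather than
numbers taken from the spec or from Qcom hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the two terms of the L1ss exit budget. */
struct model_l1ss_timing {
	uint32_t l10_refclk_on_us;	/* L10_REFCLK_ON term, in microseconds */
	uint32_t t_commonmode_us;	/* T_COMMONMODE term, in microseconds */
};

/* Budget is roughly L10_REFCLK_ON + T_COMMONMODE, as stated above. */
static uint64_t model_l1ss_exit_budget_us(const struct model_l1ss_timing *t)
{
	assert(t != NULL);
	return (uint64_t)t->l10_refclk_on_us + t->t_commonmode_us;
}

/*
 * Returns true if the host recovery (system wake, PHY restore, refclk
 * enable) fits in the budget. A false result models the case where the
 * endpoint may trigger Link Down (LDn).
 */
static bool model_l1ss_exit_in_time(const struct model_l1ss_timing *t,
				    uint64_t recovery_us)
{
	return recovery_us <= model_l1ss_exit_budget_us(t);
}
```

With the votes retained, the recovery time stays small and the check passes.
With the PHY powered off, the recovery time includes the whole system wake
path and the check fails.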
So to ensure that the client drivers can properly handle this scenario, let
them know about this platform limitation by setting the
'pci_host_bridge::broken_l1ss_resume' flag.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/controller/dwc/pcie-qcom.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index af6bf5cce65b..75bb6cb5e35e 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -1368,6 +1368,18 @@ static void qcom_pcie_host_post_init(struct dw_pcie_rp *pp)
struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
struct qcom_pcie *pcie = to_qcom_pcie(pci);
+ /*
+ * During system suspend, the Qcom RC driver may turn off the analog
+ * circuitry of PHY and remove controller votes to save power. If the
+ * link is in L1ss and the endpoint asserts CLKREQ# to exit L1ss, the
+ * time required to wake the system and restore the PHY/refclk will
+ * exceed the strict L1ss exit timing (L10_REFCLK_ON + T_COMMONMODE),
+ * resulting in Link Down (LDn) condition. Set this flag to indicate
+ * this limitation to client drivers so that they can avoid relying on
+ * L1ss during system suspend.
+ */
+ pp->bridge->broken_l1ss_resume = true;
+
if (pcie->cfg->ops->host_post_init)
pcie->cfg->ops->host_post_init(pcie);
}
--
2.48.1
* [PATCH v2 4/4] nvme-pci: Use pci_suspend_retains_context() API during suspend
2026-05-19 8:11 [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API Manivannan Sadhasivam via B4 Relay
` (2 preceding siblings ...)
2026-05-19 8:11 ` [PATCH v2 3/4] PCI: qcom: Indicate broken L1ss exit " Manivannan Sadhasivam via B4 Relay
@ 2026-05-19 8:11 ` Manivannan Sadhasivam via B4 Relay
2026-05-19 8:47 ` Christoph Hellwig
2026-05-22 16:04 ` [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API Hans Zhang
2026-05-23 11:35 ` Bjorn Helgaas
5 siblings, 1 reply; 11+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2026-05-19 8:11 UTC (permalink / raw)
To: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg
Cc: linux-pci, linux-kernel, linux-arm-msm, linux-nvme,
Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
The pci_suspend_retains_context() API lets PCI client drivers know if the
platform can retain the device context during suspend. This is decided
based on several factors like:
1. Firmware involvement at the end of suspend
2. Any platform limitation in waking from low power state (L1ss)
This API might also be extended in the future to cover other
platform-specific issues impacting the device low power mode during system suspend.
So use this API instead of checks like pm_suspend_via_firmware(). When this
API returns false, then assume that the platform cannot retain the context
and shut down the controller. If it returns true, then assume that the
context will be retained and keep the device in low power mode.
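The shutdown decision in nvme_suspend() after this change can be modelled as
a single predicate. The sketch below is a userspace approximation:
'struct model_nvme_state' is a hypothetical flattening of the values that the
driver reads from pci_suspend_retains_context(), ctrl->npss,
pcie_aspm_enabled() and the NVME_QUIRK_SIMPLE_SUSPEND quirk bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Inputs to the decision, flattened into one hypothetical struct. */
struct model_nvme_state {
	bool retains_context;	/* pci_suspend_retains_context(pdev) */
	unsigned int npss;	/* ctrl->npss: non-default power states */
	bool aspm_enabled;	/* pcie_aspm_enabled(pdev) */
	bool simple_suspend;	/* NVME_QUIRK_SIMPLE_SUSPEND is set */
};

/*
 * Returns true when the controller should be shut down fully, i.e. the
 * branch that calls nvme_disable_prepare_reset() in the real driver.
 */
static bool model_nvme_needs_full_shutdown(const struct model_nvme_state *s)
{
	assert(s != NULL);

	return !s->retains_context || !s->npss || !s->aspm_enabled ||
	       s->simple_suspend;
}
```

Only when all four inputs are favourable does the driver keep the device in a
low power state instead of shutting it down.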
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/nvme/host/pci.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index db5fc9bf6627..a6664983ce5d 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3915,6 +3915,7 @@ static int nvme_suspend(struct device *dev)
* use host managed nvme power settings for lowest idle power if
* possible. This should have quicker resume latency than a full device
* shutdown. But if the firmware is involved after the suspend or the
+ * platform has any limitation in waking from low power states or the
* device does not support any non-default power states, shut down the
* device fully.
*
@@ -3923,7 +3924,7 @@ static int nvme_suspend(struct device *dev)
* down, so as to allow the platform to achieve its minimum low-power
* state (which may not be possible if the link is up).
*/
- if (pm_suspend_via_firmware() || !ctrl->npss ||
+ if (!pci_suspend_retains_context(pdev) || !ctrl->npss ||
!pcie_aspm_enabled(pdev) ||
(ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND))
return nvme_disable_prepare_reset(ndev, true);
--
2.48.1
* Re: [PATCH v2 4/4] nvme-pci: Use pci_suspend_retains_context() API during suspend
2026-05-19 8:11 ` [PATCH v2 4/4] nvme-pci: Use pci_suspend_retains_context() API during suspend Manivannan Sadhasivam via B4 Relay
@ 2026-05-19 8:47 ` Christoph Hellwig
0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2026-05-19 8:47 UTC (permalink / raw)
To: manivannan.sadhasivam
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-pci, linux-kernel,
linux-arm-msm, linux-nvme
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API
2026-05-19 8:11 [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API Manivannan Sadhasivam via B4 Relay
` (3 preceding siblings ...)
2026-05-19 8:11 ` [PATCH v2 4/4] nvme-pci: Use pci_suspend_retains_context() API during suspend Manivannan Sadhasivam via B4 Relay
@ 2026-05-22 16:04 ` Hans Zhang
2026-05-23 13:55 ` Manivannan Sadhasivam
2026-05-23 11:35 ` Bjorn Helgaas
5 siblings, 1 reply; 11+ messages in thread
From: Hans Zhang @ 2026-05-22 16:04 UTC (permalink / raw)
To: manivannan.sadhasivam, Bjorn Helgaas, Manivannan Sadhasivam,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
Keith Busch, Jens Axboe, Christoph Hellwig, Sagi Grimberg
Cc: linux-pci, linux-kernel, linux-arm-msm, linux-nvme
Hi Mani,
We previously discussed a patch. I wonder if you have any memory of it.
I'm not sure if it can solve my problem. As shown below:
https://lore.kernel.org/linux-pci/z4bq25pr35cklwoodz34pnfaopfrtbjwhc6gvbhbsvnwblhxia@frmtb3t3m4nk/
"""
> Hans: Before I added the printk for debugging, it hung here.
>
>
> I added the log output after debugging printk.
>
> Sky1 SOC Root Port driver's suspend function: sky1_pcie_suspend_noirq
> Our hardware is in STR(suspend to ram), and the controller and PHY will lose
> power.
>
> So in sky1_pcie_suspend_noirq, the AXI,APB clock, etc. of the PCIe
> controller will be turned off. In sky1_pcie_resume_noirq, the PCIe
> controller and PHY will be reinitialized. If suspend does not close the AXI
> and APB clock, and the AXI is reopened during the resume process, the APB
> clock will cause the reference count of the kernel API to accumulate
> continuously.
>
So this is the actual issue (controller losing power during system suspend) and
everything else (ASPM, MSIX write) are all side effects of it.
Yes, this issue is more common with several vendors and we need to come up with
a generic solution instead of hacking up the client drivers. I'm planning to
work on it in the coming days. Will keep you in the loop.
- Mani
"""
Best regards,
Hans
On 5/19/26 16:11, Manivannan Sadhasivam via B4 Relay wrote:
> Hi all,
>
> This series introduces a new PCI API, pci_suspend_retains_context() to
> let the client drivers know whether they can expect context retention across
> suspend/resume or not and uses it in the NVMe PCI host driver.
>
> This new API is targeted to abstract the PCI power management details away from
> the client drivers. This is needed because client drivers like NVMe make use of
> APIs such as pm_suspend_via_firmware() and decide to keep the device in low
> power mode if this API returns 'false'. But some platforms may have other
> limitations like in the case of Qcom, where if the RC driver removes the PCIe RC
> resource vote to allow the SoC to enter low power mode, it cannot reliably exit
> the L1ss state when the endpoint asserts CLKREQ#. So in this case also, the
> client drivers cannot keep the device in low power state during suspend and
> expect context retention.
>
> And these limitations may just keep adding in the future. Without a unified
> API, the client drivers have to implement their own logic which may cause code
> duplication and may also lead to drivers missing some of the platform
> limitations.
>
> Once this series gets merged, we can extend this API usage to other client
> drivers as well.
>
> Testing
> =======
>
> This series is tested on Qualcomm Hamoa based Lenovo Thinkpad T14s laptop with
> NVMe drive.
>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> ---
> Changes in v2:
> - Renamed the API to pci_suspend_retains_context()
> - Reworded the commit messages to include L10_REFCLK_ON + T_COMMONMODE as the
> L1ss exit latency
> - Rebased on top of v7.1-rc1
>
> ---
> Manivannan Sadhasivam (4):
> PCI: Introduce an API to check if RC/platform can retain device context during suspend
> PCI: Indicate context lost if L1ss exit is broken during resume from system suspend
> PCI: qcom: Indicate broken L1ss exit during resume from system suspend
> nvme-pci: Use pci_suspend_retains_context() API during suspend
>
> drivers/nvme/host/pci.c | 3 ++-
> drivers/pci/controller/dwc/pcie-qcom.c | 12 ++++++++++++
> drivers/pci/pci.c | 34 ++++++++++++++++++++++++++++++++++
> include/linux/pci.h | 9 +++++++++
> 4 files changed, 57 insertions(+), 1 deletion(-)
> ---
> base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
> change-id: 20260414-l1ss-fix-6c9cf2451944
>
> Best regards,
* Re: [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API
2026-05-22 16:04 ` [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API Hans Zhang
@ 2026-05-23 13:55 ` Manivannan Sadhasivam
0 siblings, 0 replies; 11+ messages in thread
From: Manivannan Sadhasivam @ 2026-05-23 13:55 UTC (permalink / raw)
To: Hans Zhang
Cc: manivannan.sadhasivam, Bjorn Helgaas, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-pci, linux-kernel,
linux-arm-msm, linux-nvme
On Sat, May 23, 2026 at 12:04:02AM +0800, Hans Zhang wrote:
> Hi Mani,
>
> We previously discussed a patch. I wonder if you have any memory of it. I'm
> not sure if it can solve my problem. As shown below:
>
> https://lore.kernel.org/linux-pci/z4bq25pr35cklwoodz34pnfaopfrtbjwhc6gvbhbsvnwblhxia@frmtb3t3m4nk/
>
This series won't address your issue, as it works around a PCIe controller issue
in waking from L1ss.
But the patch below, which got merged for v7.2, might help you:
https://lore.kernel.org/all/20251231162126.7728-1-manivannan.sadhasivam@oss.qualcomm.com/
I'm not sure if your platform firmware uses PSCI for S2R (Suspend To RAM) or
not. If PSCI is used, the above patch will turn off NVMe during S2R. If it uses
some other firmware mechanism instead of PSCI, then the kernel driver
interacting with the firmware should call pm_set_{suspend/resume}_via_firmware()
from the S2R suspend ops, like the above patch does, to allow the NVMe driver to
turn off the controller during S2R.
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API
2026-05-19 8:11 [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API Manivannan Sadhasivam via B4 Relay
` (4 preceding siblings ...)
2026-05-22 16:04 ` [PATCH v2 0/4] PCI: Introduce pci_suspend_retains_context() API Hans Zhang
@ 2026-05-23 11:35 ` Bjorn Helgaas
5 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2026-05-23 11:35 UTC (permalink / raw)
To: manivannan.sadhasivam
Cc: Bjorn Helgaas, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Keith Busch, Jens Axboe,
Christoph Hellwig, Sagi Grimberg, linux-pci, linux-kernel,
linux-arm-msm, linux-nvme
On Tue, May 19, 2026 at 01:41:19PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> Hi all,
>
> This series introduces a new PCI API, pci_suspend_retains_context() to
> let the client drivers know whether they can expect context retention across
> suspend/resume or not and uses it in the NVMe PCI host driver.
>
> This new API is targeted to abstract the PCI power management details away from
> the client drivers. This is needed because client drivers like NVMe make use of
> APIs such as pm_suspend_via_firmware() and decide to keep the device in low
> power mode if this API returns 'false'. But some platforms may have other
> limitations like in the case of Qcom, where if the RC driver removes the PCIe RC
> resource vote to allow the SoC to enter low power mode, it cannot reliably exit
> the L1ss state when the endpoint asserts CLKREQ#. So in this case also, the
> client drivers cannot keep the device in low power state during suspend and
> expect context retention.
>
> And these limitations may just keep adding in the future. Without a unified
> API, the client drivers have to implement their own logic which may cause code
> duplication and may also lead to drivers missing some of the platform
> limitations.
>
> Once this series gets merged, we can extend this API usage to other client
> drivers as well.
>
> Testing
> =======
>
> This series is tested on Qualcomm Hamoa based Lenovo Thinkpad T14s laptop with
> NVMe drive.
>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> ---
> Changes in v2:
> - Renamed the API to pci_suspend_retains_context()
> - Reworded the commit messages to include L10_REFCLK_ON + T_COMMONMODE as the
> L1ss exit latency
> - Rebased on top of v7.1-rc1
>
> ---
> Manivannan Sadhasivam (4):
> PCI: Introduce an API to check if RC/platform can retain device context during suspend
> PCI: Indicate context lost if L1ss exit is broken during resume from system suspend
> PCI: qcom: Indicate broken L1ss exit during resume from system suspend
> nvme-pci: Use pci_suspend_retains_context() API during suspend
>
> drivers/nvme/host/pci.c | 3 ++-
> drivers/pci/controller/dwc/pcie-qcom.c | 12 ++++++++++++
> drivers/pci/pci.c | 34 ++++++++++++++++++++++++++++++++++
> include/linux/pci.h | 9 +++++++++
> 4 files changed, 57 insertions(+), 1 deletion(-)
Applied to pci/pm for v7.2, thanks, Mani!