public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed
* [PATCH v6 0/4] Add quirks to proceed PME handshake in DWC PM
@ 2025-09-24  7:23 Richard Zhu
  2025-09-24  7:23 ` [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2 Richard Zhu
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Richard Zhu @ 2025-09-24  7:23 UTC (permalink / raw)
  To: frank.li, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam
  Cc: linux-pci, linux-arm-kernel, imx, linux-kernel

Refer to PCIe r6.0, sec 5.2, fig 5-1 Link Power Management State Flow
Diagram. Both L0 and L2/L3 Ready can be transferred to LDn directly.

It's harmless to let dw_pcie_suspend_noirq() proceed suspend after the
PME_Turn_Off is sent out, whatever the LTSSM state is in L2 or L3 after
a recommended 10ms max wait refer to PCIe r6.0, sec 5.3.3.2.1 PME
Synchronization.

The LTSSM states of i.MX6QP PCIe is inaccessible after the PME_Turn_Off
is kicked off. To handle this case, don't poll L2 state and add one max
10ms delay if QUIRK_NOL2POLL_IN_PM flag is existing in suspend.

Main changes in v6:
Refer to Mani' comments.
- Update the commit message of first patch.
- Squash the 6-3 and 6-4 into 6-2 of v5 patch-set.
- Add the Fixes tag, and CC stable list.

Main changes in v5:
- Fix build warning caused by 6-1 patch.
v4:https://lore.kernel.org/imx/20250822084341.3160738-1-hongxing.zhu@nxp.com/

Main changes in v4:
- Add one patch[1/6] to remove the L1SS check during L2 entry.
- Update the code comments of 2/6 patch and commit description of 6/6 patch.
- Add background to 5/6 to describe why skip PME_Turn_off message when no
endpoint device is connected.
v3:https://lore.kernel.org/imx/20250818073205.1412507-1-hongxing.zhu@nxp.com/

Main changes in v3:
- Adjust the patch sequence to avoid the build break.
- Update the commit message.
v2:https://lore.kernel.org/imx/20250618024116.3704579-1-hongxing.zhu@nxp.com/

Main changes in v2:
Add the following two patches.
- Skip PME_Turn_Off message if there is no endpoint connected.
- Don't return error when wait for link up.
v1:https://lore.kernel.org/imx/20250408065221.1941928-1-hongxing.zhu@nxp.com/

[PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the
[PATCH v6 2/4] PCI: dwc: Don't poll L2 if QUIRK_NOL2POLL_IN_PM is
[PATCH v6 3/4] PCI: dwc: Skip PME_Turn_Off message if there is no
[PATCH v6 4/4] PCI: dwc: Don't return error when wait for link up in

drivers/pci/controller/dwc/pci-imx6.c             |  4 ++++
drivers/pci/controller/dwc/pcie-designware-host.c | 62 +++++++++++++++++++++++++++++++++++---------------------------
drivers/pci/controller/dwc/pcie-designware.h      |  4 ++++
3 files changed, 43 insertions(+), 27 deletions(-)



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2
  2025-09-24  7:23 [PATCH v6 0/4] Add quirks to proceed PME handshake in DWC PM Richard Zhu
@ 2025-09-24  7:23 ` Richard Zhu
  2025-09-24 19:44   ` Bjorn Helgaas
  2025-09-24  7:23 ` [PATCH v6 2/4] PCI: dwc: Don't poll L2 if QUIRK_NOL2POLL_IN_PM is existing in suspend Richard Zhu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Richard Zhu @ 2025-09-24  7:23 UTC (permalink / raw)
  To: frank.li, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam
  Cc: linux-pci, linux-arm-kernel, imx, linux-kernel, Richard Zhu,
	stable, Bjorn Helgaas

The ASPM configuration shouldn't leak out here. Remove the L1SS check
during L2 entry.

Cc: stable@vger.kernel.org
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
---
 drivers/pci/controller/dwc/pcie-designware-host.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 952f8594b501..9d46d1f0334b 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -1005,17 +1005,9 @@ static int dw_pcie_pme_turn_off(struct dw_pcie *pci)
 
 int dw_pcie_suspend_noirq(struct dw_pcie *pci)
 {
-	u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
 	u32 val;
 	int ret;
 
-	/*
-	 * If L1SS is supported, then do not put the link into L2 as some
-	 * devices such as NVMe expect low resume latency.
-	 */
-	if (dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKCTL) & PCI_EXP_LNKCTL_ASPM_L1)
-		return 0;
-
 	if (pci->pp.ops->pme_turn_off) {
 		pci->pp.ops->pme_turn_off(&pci->pp);
 	} else {
-- 
2.37.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 2/4] PCI: dwc: Don't poll L2 if QUIRK_NOL2POLL_IN_PM is existing in suspend
  2025-09-24  7:23 [PATCH v6 0/4] Add quirks to proceed PME handshake in DWC PM Richard Zhu
  2025-09-24  7:23 ` [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2 Richard Zhu
@ 2025-09-24  7:23 ` Richard Zhu
  2025-09-24  7:23 ` [PATCH v6 3/4] PCI: dwc: Skip PME_Turn_Off message if there is no endpoint connected Richard Zhu
  2025-09-24  7:23 ` [PATCH v6 4/4] PCI: dwc: Don't return error when wait for link up in dw_pcie_resume_noirq() Richard Zhu
  3 siblings, 0 replies; 11+ messages in thread
From: Richard Zhu @ 2025-09-24  7:23 UTC (permalink / raw)
  To: frank.li, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam
  Cc: linux-pci, linux-arm-kernel, imx, linux-kernel, Richard Zhu,
	stable, Frank Li

Refer to PCIe r6.0, sec 5.2, fig 5-1 Link Power Management State Flow
Diagram. Both L0 and L2/L3 Ready can be transferred to LDn directly.

It's harmless to let dw_pcie_suspend_noirq() proceed suspend after the
PME_Turn_Off is sent out, whatever the LTSSM state is in L2 or L3 after
a recommended 10ms max wait refer to PCIe r6.0, sec 5.3.3.2.1 PME
Synchronization.

The LTSSM states are inaccessible on i.MX6QP and i.MX7D after the
PME_Turn_Off is sent out.

To support this case, don't poll L2 state and apply a simple delay of
PCIE_PME_TO_L2_TIMEOUT_US(10ms) if the QUIRK_NOL2POLL_IN_PM flag is set
in suspend.

Cc: stable@vger.kernel.org
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method")
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
 drivers/pci/controller/dwc/pci-imx6.c         |  4 +++
 .../pci/controller/dwc/pcie-designware-host.c | 34 +++++++++++++------
 drivers/pci/controller/dwc/pcie-designware.h  |  4 +++
 3 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/controller/dwc/pci-imx6.c b/drivers/pci/controller/dwc/pci-imx6.c
index 80e48746bbaf..a59b5282c3cc 100644
--- a/drivers/pci/controller/dwc/pci-imx6.c
+++ b/drivers/pci/controller/dwc/pci-imx6.c
@@ -125,6 +125,7 @@ struct imx_pcie_drvdata {
 	enum imx_pcie_variants variant;
 	enum dw_pcie_device_mode mode;
 	u32 flags;
+	u32 quirk;
 	int dbi_length;
 	const char *gpr;
 	const u32 ltssm_off;
@@ -1765,6 +1766,7 @@ static int imx_pcie_probe(struct platform_device *pdev)
 	if (ret)
 		return ret;
 
+	pci->quirk_flag = imx_pcie->drvdata->quirk;
 	pci->use_parent_dt_ranges = true;
 	if (imx_pcie->drvdata->mode == DW_PCIE_EP_TYPE) {
 		ret = imx_add_pcie_ep(imx_pcie, pdev);
@@ -1849,6 +1851,7 @@ static const struct imx_pcie_drvdata drvdata[] = {
 		.enable_ref_clk = imx6q_pcie_enable_ref_clk,
 		.core_reset = imx6qp_pcie_core_reset,
 		.ops = &imx_pcie_host_ops,
+		.quirk = QUIRK_NOL2POLL_IN_PM,
 	},
 	[IMX7D] = {
 		.variant = IMX7D,
@@ -1860,6 +1863,7 @@ static const struct imx_pcie_drvdata drvdata[] = {
 		.mode_mask[0] = IMX6Q_GPR12_DEVICE_TYPE,
 		.enable_ref_clk = imx7d_pcie_enable_ref_clk,
 		.core_reset = imx7d_pcie_core_reset,
+		.quirk = QUIRK_NOL2POLL_IN_PM,
 	},
 	[IMX8MQ] = {
 		.variant = IMX8MQ,
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 9d46d1f0334b..57a1ba08c427 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -1016,15 +1016,29 @@ int dw_pcie_suspend_noirq(struct dw_pcie *pci)
 			return ret;
 	}
 
-	ret = read_poll_timeout(dw_pcie_get_ltssm, val,
-				val == DW_PCIE_LTSSM_L2_IDLE ||
-				val <= DW_PCIE_LTSSM_DETECT_WAIT,
-				PCIE_PME_TO_L2_TIMEOUT_US/10,
-				PCIE_PME_TO_L2_TIMEOUT_US, false, pci);
-	if (ret) {
-		/* Only log message when LTSSM isn't in DETECT or POLL */
-		dev_err(pci->dev, "Timeout waiting for L2 entry! LTSSM: 0x%x\n", val);
-		return ret;
+	if (dwc_quirk(pci, QUIRK_NOL2POLL_IN_PM)) {
+		/*
+		 * Add the QUIRK_NOL2_POLL_IN_PM case to avoid the read hang,
+		 * when LTSSM is not powered in L2/L3/LDn properly.
+		 *
+		 * Refer to PCIe r6.0, sec 5.2, fig 5-1 Link Power Management
+		 * State Flow Diagram. Both L0 and L2/L3 Ready can be
+		 * transferred to LDn directly. On the LTSSM states poll broken
+		 * platforms, add a max 10ms delay refer to PCIe r6.0,
+		 * sec 5.3.3.2.1 PME Synchronization.
+		 */
+		mdelay(PCIE_PME_TO_L2_TIMEOUT_US/1000);
+	} else {
+		ret = read_poll_timeout(dw_pcie_get_ltssm, val,
+					val == DW_PCIE_LTSSM_L2_IDLE ||
+					val <= DW_PCIE_LTSSM_DETECT_WAIT,
+					PCIE_PME_TO_L2_TIMEOUT_US/10,
+					PCIE_PME_TO_L2_TIMEOUT_US, false, pci);
+		if (ret) {
+			/* Only log message when LTSSM isn't in DETECT or POLL */
+			dev_err(pci->dev, "Timeout waiting for L2 entry! LTSSM: 0x%x\n", val);
+			return ret;
+		}
 	}
 
 	/*
@@ -1040,7 +1054,7 @@ int dw_pcie_suspend_noirq(struct dw_pcie *pci)
 
 	pci->suspended = true;
 
-	return ret;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(dw_pcie_suspend_noirq);
 
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 00f52d472dcd..4e5bf6cb6ce8 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -295,6 +295,9 @@
 /* Default eDMA LLP memory size */
 #define DMA_LLP_MEM_SIZE		PAGE_SIZE
 
+#define QUIRK_NOL2POLL_IN_PM		BIT(0)
+#define dwc_quirk(pci, val)		(pci->quirk_flag & val)
+
 struct dw_pcie;
 struct dw_pcie_rp;
 struct dw_pcie_ep;
@@ -504,6 +507,7 @@ struct dw_pcie {
 	const struct dw_pcie_ops *ops;
 	u32			version;
 	u32			type;
+	u32			quirk_flag;
 	unsigned long		caps;
 	int			num_lanes;
 	int			max_link_speed;
-- 
2.37.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 3/4] PCI: dwc: Skip PME_Turn_Off message if there is no endpoint connected
  2025-09-24  7:23 [PATCH v6 0/4] Add quirks to proceed PME handshake in DWC PM Richard Zhu
  2025-09-24  7:23 ` [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2 Richard Zhu
  2025-09-24  7:23 ` [PATCH v6 2/4] PCI: dwc: Don't poll L2 if QUIRK_NOL2POLL_IN_PM is existing in suspend Richard Zhu
@ 2025-09-24  7:23 ` Richard Zhu
  2025-09-24 19:59   ` Bjorn Helgaas
  2025-09-24  7:23 ` [PATCH v6 4/4] PCI: dwc: Don't return error when wait for link up in dw_pcie_resume_noirq() Richard Zhu
  3 siblings, 1 reply; 11+ messages in thread
From: Richard Zhu @ 2025-09-24  7:23 UTC (permalink / raw)
  To: frank.li, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam
  Cc: linux-pci, linux-arm-kernel, imx, linux-kernel, Richard Zhu,
	stable, Frank Li

A chip freeze is observed on i.MX7D when PCIe RC kicks off the PM_PME
message and no any devices are connected on the port.

To workaroud such kind of issue, skip PME_Turn_Off message if there is
no endpoint connected.

Cc: stable@vger.kernel.org
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method")
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
 drivers/pci/controller/dwc/pcie-designware-host.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 57a1ba08c427..b303a74b0fd7 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -1008,12 +1008,15 @@ int dw_pcie_suspend_noirq(struct dw_pcie *pci)
 	u32 val;
 	int ret;
 
-	if (pci->pp.ops->pme_turn_off) {
-		pci->pp.ops->pme_turn_off(&pci->pp);
-	} else {
-		ret = dw_pcie_pme_turn_off(pci);
-		if (ret)
-			return ret;
+	/* Skip PME_Turn_Off message if there is no endpoint connected */
+	if (dw_pcie_get_ltssm(pci) > DW_PCIE_LTSSM_DETECT_WAIT) {
+		if (pci->pp.ops->pme_turn_off) {
+			pci->pp.ops->pme_turn_off(&pci->pp);
+		} else {
+			ret = dw_pcie_pme_turn_off(pci);
+			if (ret)
+				return ret;
+		}
 	}
 
 	if (dwc_quirk(pci, QUIRK_NOL2POLL_IN_PM)) {
-- 
2.37.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v6 4/4] PCI: dwc: Don't return error when wait for link up in dw_pcie_resume_noirq()
  2025-09-24  7:23 [PATCH v6 0/4] Add quirks to proceed PME handshake in DWC PM Richard Zhu
                   ` (2 preceding siblings ...)
  2025-09-24  7:23 ` [PATCH v6 3/4] PCI: dwc: Skip PME_Turn_Off message if there is no endpoint connected Richard Zhu
@ 2025-09-24  7:23 ` Richard Zhu
  3 siblings, 0 replies; 11+ messages in thread
From: Richard Zhu @ 2025-09-24  7:23 UTC (permalink / raw)
  To: frank.li, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam
  Cc: linux-pci, linux-arm-kernel, imx, linux-kernel, Richard Zhu,
	stable, Frank Li

When waiting for the PCIe link to come up, both link up and link down
are valid results depending on the device state.

Since the link may come up later and to get rid of the following
mis-reported PM errors. Do not return an -ETIMEDOUT error, as the
outcome has already been reported in dw_pcie_wait_for_link().

PM error logs introduced by the -ETIMEDOUT error return.
imx6q-pcie 33800000.pcie: Phy link never came up
imx6q-pcie 33800000.pcie: PM: dpm_run_callback(): genpd_resume_noirq returns -110
imx6q-pcie 33800000.pcie: PM: failed to resume noirq: error -110

Cc: stable@vger.kernel.org
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
---
 drivers/pci/controller/dwc/pcie-designware-host.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index b303a74b0fd7..c4386be38a07 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -1084,10 +1084,9 @@ int dw_pcie_resume_noirq(struct dw_pcie *pci)
 	if (ret)
 		return ret;
 
-	ret = dw_pcie_wait_for_link(pci);
-	if (ret)
-		return ret;
+	/* Ignore errors, the link may come up later */
+	dw_pcie_wait_for_link(pci);
 
-	return ret;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(dw_pcie_resume_noirq);
-- 
2.37.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2
  2025-09-24  7:23 ` [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2 Richard Zhu
@ 2025-09-24 19:44   ` Bjorn Helgaas
  2025-09-24 20:59     ` Frank Li
  0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2025-09-24 19:44 UTC (permalink / raw)
  To: Richard Zhu
  Cc: frank.li, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam, linux-pci,
	linux-arm-kernel, imx, linux-kernel, stable

On Wed, Sep 24, 2025 at 03:23:21PM +0800, Richard Zhu wrote:
> The ASPM configuration shouldn't leak out here. Remove the L1SS check
> during L2 entry.

I'm all in favor of removing this code if possible, but we need to
explain why this is safe.  The L1SS check was added for some reason,
and we need to explain why that reason doesn't apply.

> Cc: stable@vger.kernel.org
> Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
> Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
> ---
>  drivers/pci/controller/dwc/pcie-designware-host.c | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 952f8594b501..9d46d1f0334b 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -1005,17 +1005,9 @@ static int dw_pcie_pme_turn_off(struct dw_pcie *pci)
>  
>  int dw_pcie_suspend_noirq(struct dw_pcie *pci)
>  {
> -	u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
>  	u32 val;
>  	int ret;
>  
> -	/*
> -	 * If L1SS is supported, then do not put the link into L2 as some
> -	 * devices such as NVMe expect low resume latency.
> -	 */
> -	if (dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKCTL) & PCI_EXP_LNKCTL_ASPM_L1)
> -		return 0;
> -
>  	if (pci->pp.ops->pme_turn_off) {
>  		pci->pp.ops->pme_turn_off(&pci->pp);
>  	} else {
> -- 
> 2.37.1
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 3/4] PCI: dwc: Skip PME_Turn_Off message if there is no endpoint connected
  2025-09-24  7:23 ` [PATCH v6 3/4] PCI: dwc: Skip PME_Turn_Off message if there is no endpoint connected Richard Zhu
@ 2025-09-24 19:59   ` Bjorn Helgaas
  2025-09-26  6:09     ` Hongxing Zhu
  0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2025-09-24 19:59 UTC (permalink / raw)
  To: Richard Zhu
  Cc: frank.li, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam, linux-pci,
	linux-arm-kernel, imx, linux-kernel, stable

On Wed, Sep 24, 2025 at 03:23:23PM +0800, Richard Zhu wrote:
> A chip freeze is observed on i.MX7D when PCIe RC kicks off the PM_PME
> message and no any devices are connected on the port.

s/no any/no/

> To workaroud such kind of issue, skip PME_Turn_Off message if there is
> no endpoint connected.

s/workaroud/work around/

> Cc: stable@vger.kernel.org
> Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
> Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method")
> Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
> Reviewed-by: Frank Li <Frank.Li@nxp.com>
> ---
>  drivers/pci/controller/dwc/pcie-designware-host.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 57a1ba08c427..b303a74b0fd7 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -1008,12 +1008,15 @@ int dw_pcie_suspend_noirq(struct dw_pcie *pci)
>  	u32 val;
>  	int ret;
>  
> -	if (pci->pp.ops->pme_turn_off) {
> -		pci->pp.ops->pme_turn_off(&pci->pp);
> -	} else {
> -		ret = dw_pcie_pme_turn_off(pci);
> -		if (ret)
> -			return ret;
> +	/* Skip PME_Turn_Off message if there is no endpoint connected */
> +	if (dw_pcie_get_ltssm(pci) > DW_PCIE_LTSSM_DETECT_WAIT) {

This looks racy and it sounds like this is a workaround for an i.MX7D
defect.  Should it be some kind of quirk just for i.MX7D?

> +		if (pci->pp.ops->pme_turn_off) {
> +			pci->pp.ops->pme_turn_off(&pci->pp);
> +		} else {
> +			ret = dw_pcie_pme_turn_off(pci);
> +			if (ret)
> +				return ret;
> +		}
>  	}
>  
>  	if (dwc_quirk(pci, QUIRK_NOL2POLL_IN_PM)) {
> -- 
> 2.37.1
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2
  2025-09-24 19:44   ` Bjorn Helgaas
@ 2025-09-24 20:59     ` Frank Li
  2025-09-24 22:06       ` Bjorn Helgaas
  0 siblings, 1 reply; 11+ messages in thread
From: Frank Li @ 2025-09-24 20:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Richard Zhu, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam, linux-pci,
	linux-arm-kernel, imx, linux-kernel, stable

On Wed, Sep 24, 2025 at 02:44:57PM -0500, Bjorn Helgaas wrote:
> On Wed, Sep 24, 2025 at 03:23:21PM +0800, Richard Zhu wrote:
> > The ASPM configuration shouldn't leak out here. Remove the L1SS check
> > during L2 entry.
>
> I'm all in favor of removing this code if possible, but we need to
> explain why this is safe.  The L1SS check was added for some reason,
> and we need to explain why that reason doesn't apply.

That's original discussion
https://lore.kernel.org/linux-pci/20230720160738.GC48270@thinkpad/

"To be precise, NVMe driver will shutdown the device if there is no ASPM support
and keep it in low power mode otherwise (there are other cases as well but we do
not need to worry).

But here you are not checking for ASPM state in the suspend path, and just
forcing the link to be in L2/L3 (thereby D3Cold) even though NVMe driver may
expect it to be in low power state like ASPM/APST.

So you should only put the link to L2/L3 if there is no ASPM support. Otherwise,
you'll ending up with bug reports when users connect NVMe to it.

- Mani"

Frank


>
> > Cc: stable@vger.kernel.org
> > Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
> > Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> > Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
> > ---
> >  drivers/pci/controller/dwc/pcie-designware-host.c | 8 --------
> >  1 file changed, 8 deletions(-)
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index 952f8594b501..9d46d1f0334b 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -1005,17 +1005,9 @@ static int dw_pcie_pme_turn_off(struct dw_pcie *pci)
> >
> >  int dw_pcie_suspend_noirq(struct dw_pcie *pci)
> >  {
> > -	u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> >  	u32 val;
> >  	int ret;
> >
> > -	/*
> > -	 * If L1SS is supported, then do not put the link into L2 as some
> > -	 * devices such as NVMe expect low resume latency.
> > -	 */
> > -	if (dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKCTL) & PCI_EXP_LNKCTL_ASPM_L1)
> > -		return 0;
> > -
> >  	if (pci->pp.ops->pme_turn_off) {
> >  		pci->pp.ops->pme_turn_off(&pci->pp);
> >  	} else {
> > --
> > 2.37.1
> >


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2
  2025-09-24 20:59     ` Frank Li
@ 2025-09-24 22:06       ` Bjorn Helgaas
  2025-09-25 16:03         ` Frank Li
  0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2025-09-24 22:06 UTC (permalink / raw)
  To: Frank Li
  Cc: Richard Zhu, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam, linux-pci,
	linux-arm-kernel, imx, linux-kernel, stable

On Wed, Sep 24, 2025 at 04:59:11PM -0400, Frank Li wrote:
> On Wed, Sep 24, 2025 at 02:44:57PM -0500, Bjorn Helgaas wrote:
> > On Wed, Sep 24, 2025 at 03:23:21PM +0800, Richard Zhu wrote:
> > > The ASPM configuration shouldn't leak out here. Remove the L1SS check
> > > during L2 entry.
> >
> > I'm all in favor of removing this code if possible, but we need to
> > explain why this is safe.  The L1SS check was added for some reason,
> > and we need to explain why that reason doesn't apply.
> 
> That's original discussion
> https://lore.kernel.org/linux-pci/20230720160738.GC48270@thinkpad/
> 
> "To be precise, NVMe driver will shutdown the device if there is no
> ASPM support and keep it in low power mode otherwise (there are
> other cases as well but we do not need to worry).
> 
> But here you are not checking for ASPM state in the suspend path,
> and just forcing the link to be in L2/L3 (thereby D3Cold) even
> though NVMe driver may expect it to be in low power state like
> ASPM/APST.
> 
> So you should only put the link to L2/L3 if there is no ASPM
> support. Otherwise, you'll ending up with bug reports when users
> connect NVMe to it.
> 
> - Mani"

Whatever the reasoning is, it needs to be in the commit log.  The
above might be leading to the reasoning, but it would need a lot more
dots to be connected to be persuasive.

If NVMe is making assumptions about the ASPM configuration, there
needs to be some generic way to keep track of that.  E.g., if NVMe
doesn't work correctly with some ASPM states, maybe it shouldn't
advertise support for those states.  Hacking up every host controller
driver doesn't seem like a viable approach.

> > > Cc: stable@vger.kernel.org
> > > Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
> > > Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> > > Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
> > > ---
> > >  drivers/pci/controller/dwc/pcie-designware-host.c | 8 --------
> > >  1 file changed, 8 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > index 952f8594b501..9d46d1f0334b 100644
> > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > @@ -1005,17 +1005,9 @@ static int dw_pcie_pme_turn_off(struct dw_pcie *pci)
> > >
> > >  int dw_pcie_suspend_noirq(struct dw_pcie *pci)
> > >  {
> > > -	u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> > >  	u32 val;
> > >  	int ret;
> > >
> > > -	/*
> > > -	 * If L1SS is supported, then do not put the link into L2 as some
> > > -	 * devices such as NVMe expect low resume latency.
> > > -	 */
> > > -	if (dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKCTL) & PCI_EXP_LNKCTL_ASPM_L1)
> > > -		return 0;
> > > -
> > >  	if (pci->pp.ops->pme_turn_off) {
> > >  		pci->pp.ops->pme_turn_off(&pci->pp);
> > >  	} else {
> > > --
> > > 2.37.1
> > >


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2
  2025-09-24 22:06       ` Bjorn Helgaas
@ 2025-09-25 16:03         ` Frank Li
  0 siblings, 0 replies; 11+ messages in thread
From: Frank Li @ 2025-09-25 16:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Richard Zhu, jingoohan1, l.stach, lpieralisi, kwilczynski, mani,
	robh, bhelgaas, shawnguo, s.hauer, kernel, festevam, linux-pci,
	linux-arm-kernel, imx, linux-kernel, stable

On Wed, Sep 24, 2025 at 05:06:05PM -0500, Bjorn Helgaas wrote:
> On Wed, Sep 24, 2025 at 04:59:11PM -0400, Frank Li wrote:
> > On Wed, Sep 24, 2025 at 02:44:57PM -0500, Bjorn Helgaas wrote:
> > > On Wed, Sep 24, 2025 at 03:23:21PM +0800, Richard Zhu wrote:
> > > > The ASPM configuration shouldn't leak out here. Remove the L1SS check
> > > > during L2 entry.
> > >
> > > I'm all in favor of removing this code if possible, but we need to
> > > explain why this is safe.  The L1SS check was added for some reason,
> > > and we need to explain why that reason doesn't apply.
> >
> > That's original discussion
> > https://lore.kernel.org/linux-pci/20230720160738.GC48270@thinkpad/
> >
> > "To be precise, NVMe driver will shutdown the device if there is no
> > ASPM support and keep it in low power mode otherwise (there are
> > other cases as well but we do not need to worry).
> >
> > But here you are not checking for ASPM state in the suspend path,
> > and just forcing the link to be in L2/L3 (thereby D3Cold) even
> > though NVMe driver may expect it to be in low power state like
> > ASPM/APST.
> >
> > So you should only put the link to L2/L3 if there is no ASPM
> > support. Otherwise, you'll ending up with bug reports when users
> > connect NVMe to it.
> >
> > - Mani"
>
> Whatever the reasoning is, it needs to be in the commit log.  The
> above might be leading to the reasoning, but it would need a lot more
> dots to be connected to be persuasive.
>
> If NVMe is making assumptions about the ASPM configuration, there
> needs to be some generic way to keep track of that.  E.g., if NVMe
> doesn't work correctly with some ASPM states, maybe it shouldn't
> advertise support for those states.  Hacking up every host controller
> driver doesn't seem like a viable approach.

Manivannan Sadhasivam:

	Does above situation still exist at current kernel?

Frank

>
> > > > Cc: stable@vger.kernel.org
> > > > Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
> > > > Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
> > > > Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
> > > > ---
> > > >  drivers/pci/controller/dwc/pcie-designware-host.c | 8 --------
> > > >  1 file changed, 8 deletions(-)
> > > >
> > > > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > index 952f8594b501..9d46d1f0334b 100644
> > > > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > > > @@ -1005,17 +1005,9 @@ static int dw_pcie_pme_turn_off(struct dw_pcie *pci)
> > > >
> > > >  int dw_pcie_suspend_noirq(struct dw_pcie *pci)
> > > >  {
> > > > -	u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> > > >  	u32 val;
> > > >  	int ret;
> > > >
> > > > -	/*
> > > > -	 * If L1SS is supported, then do not put the link into L2 as some
> > > > -	 * devices such as NVMe expect low resume latency.
> > > > -	 */
> > > > -	if (dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKCTL) & PCI_EXP_LNKCTL_ASPM_L1)
> > > > -		return 0;
> > > > -
> > > >  	if (pci->pp.ops->pme_turn_off) {
> > > >  		pci->pp.ops->pme_turn_off(&pci->pp);
> > > >  	} else {
> > > > --
> > > > 2.37.1
> > > >


^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v6 3/4] PCI: dwc: Skip PME_Turn_Off message if there is no endpoint connected
  2025-09-24 19:59   ` Bjorn Helgaas
@ 2025-09-26  6:09     ` Hongxing Zhu
  0 siblings, 0 replies; 11+ messages in thread
From: Hongxing Zhu @ 2025-09-26  6:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Frank Li, jingoohan1@gmail.com, l.stach@pengutronix.de,
	lpieralisi@kernel.org, kwilczynski@kernel.org, mani@kernel.org,
	robh@kernel.org, bhelgaas@google.com, shawnguo@kernel.org,
	s.hauer@pengutronix.de, kernel@pengutronix.de, festevam@gmail.com,
	linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	imx@lists.linux.dev, linux-kernel@vger.kernel.org,
	stable@vger.kernel.org

> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: 2025年9月25日 4:00
> To: Hongxing Zhu <hongxing.zhu@nxp.com>
> Cc: Frank Li <frank.li@nxp.com>; jingoohan1@gmail.com;
> l.stach@pengutronix.de; lpieralisi@kernel.org; kwilczynski@kernel.org;
> mani@kernel.org; robh@kernel.org; bhelgaas@google.com;
> shawnguo@kernel.org; s.hauer@pengutronix.de; kernel@pengutronix.de;
> festevam@gmail.com; linux-pci@vger.kernel.org;
> linux-arm-kernel@lists.infradead.org; imx@lists.linux.dev;
> linux-kernel@vger.kernel.org; stable@vger.kernel.org
> Subject: Re: [PATCH v6 3/4] PCI: dwc: Skip PME_Turn_Off message if there is no
> endpoint connected
> 
> On Wed, Sep 24, 2025 at 03:23:23PM +0800, Richard Zhu wrote:
> > A chip freeze is observed on i.MX7D when PCIe RC kicks off the PM_PME
> > message and no any devices are connected on the port.
> 
> s/no any/no/
> 
> > To workaroud such kind of issue, skip PME_Turn_Off message if there is
> > no endpoint connected.
> 
> s/workaroud/work around/
> 
> > Cc: stable@vger.kernel.org
> > Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume
> > functionality")
> > Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume
> > method")
> > Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
> > Reviewed-by: Frank Li <Frank.Li@nxp.com>
> > ---
> >  drivers/pci/controller/dwc/pcie-designware-host.c | 15
> > +++++++++------
> >  1 file changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c
> > b/drivers/pci/controller/dwc/pcie-designware-host.c
> > index 57a1ba08c427..b303a74b0fd7 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> > @@ -1008,12 +1008,15 @@ int dw_pcie_suspend_noirq(struct dw_pcie *pci)
> >  	u32 val;
> >  	int ret;
> >
> > -	if (pci->pp.ops->pme_turn_off) {
> > -		pci->pp.ops->pme_turn_off(&pci->pp);
> > -	} else {
> > -		ret = dw_pcie_pme_turn_off(pci);
> > -		if (ret)
> > -			return ret;
> > +	/* Skip PME_Turn_Off message if there is no endpoint connected */
> > +	if (dw_pcie_get_ltssm(pci) > DW_PCIE_LTSSM_DETECT_WAIT) {
> 
> This looks racy and it sounds like this is a workaround for an i.MX7D defect.
> Should it be some kind of quirk just for i.MX7D?
Up to now, it is used to fix the issue encountered by i.MX7D PCIe.
If the LTSSM stat is changed, for example one device is new inserted if Hot-Plug
 is supported. The procedure of PME_Turn_Off kick off would terminated by the
 hot-plug event I think.

Best Regards
Richard Zhu
> 
> > +		if (pci->pp.ops->pme_turn_off) {
> > +			pci->pp.ops->pme_turn_off(&pci->pp);
> > +		} else {
> > +			ret = dw_pcie_pme_turn_off(pci);
> > +			if (ret)
> > +				return ret;
> > +		}
> >  	}
> >
> >  	if (dwc_quirk(pci, QUIRK_NOL2POLL_IN_PM)) {
> > --
> > 2.37.1
> >
> >

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-09-26  6:10 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-24  7:23 [PATCH v6 0/4] Add quirks to proceed PME handshake in DWC PM Richard Zhu
2025-09-24  7:23 ` [PATCH v6 1/4] PCI: dwc: Remove the L1SS check before putting the link into L2 Richard Zhu
2025-09-24 19:44   ` Bjorn Helgaas
2025-09-24 20:59     ` Frank Li
2025-09-24 22:06       ` Bjorn Helgaas
2025-09-25 16:03         ` Frank Li
2025-09-24  7:23 ` [PATCH v6 2/4] PCI: dwc: Don't poll L2 if QUIRK_NOL2POLL_IN_PM is existing in suspend Richard Zhu
2025-09-24  7:23 ` [PATCH v6 3/4] PCI: dwc: Skip PME_Turn_Off message if there is no endpoint connected Richard Zhu
2025-09-24 19:59   ` Bjorn Helgaas
2025-09-26  6:09     ` Hongxing Zhu
2025-09-24  7:23 ` [PATCH v6 4/4] PCI: dwc: Don't return error when wait for link up in dw_pcie_resume_noirq() Richard Zhu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox