* [PATCH v3 1/4] PCI: dwc: Return -ENODEV from dw_pcie_wait_for_link() if device is not found
2025-12-30 15:07 [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Manivannan Sadhasivam via B4 Relay
@ 2025-12-30 15:07 ` Manivannan Sadhasivam via B4 Relay
2025-12-30 15:07 ` [PATCH v3 2/4] PCI: dwc: Rename and move ltssm_status_string() to pcie-designware.c Manivannan Sadhasivam via B4 Relay
` (4 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-12-30 15:07 UTC (permalink / raw)
To: Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas
Cc: linux-pci, linux-kernel, vincent.guittot, zhangsenchuan,
Shawn Lin, Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
The dw_pcie_wait_for_link() function waits up to 1 second for the PCIe link
to come up and returns -ETIMEDOUT for all failures without distinguishing
cases where no device is present on the bus. But the callers may want to
just skip the failure if the device is not found on the bus and handle
failure for other reasons.
So after the timeout, if the LTSSM is in Detect.Quiet or Detect.Active state,
return -ENODEV to indicate to the callers that the device is not found on the
bus, and return -ETIMEDOUT otherwise.
Also add kernel doc to document the parameter and return values.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/controller/dwc/pcie-designware.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 345365ea97c7..55c1c60f7f8f 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -692,9 +692,16 @@ void dw_pcie_disable_atu(struct dw_pcie *pci, u32 dir, int index)
dw_pcie_writel_atu(pci, dir, index, PCIE_ATU_REGION_CTRL2, 0);
}
+/**
+ * dw_pcie_wait_for_link - Wait for the PCIe link to be up
+ * @pci: DWC instance
+ *
+ * Returns: 0 if link is up, -ENODEV if device is not found, -ETIMEDOUT if the
+ * link fails to come up for other reasons.
+ */
int dw_pcie_wait_for_link(struct dw_pcie *pci)
{
- u32 offset, val;
+ u32 offset, val, ltssm;
int retries;
/* Check if the link is up or not */
@@ -706,6 +713,17 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
}
if (retries >= PCIE_LINK_WAIT_MAX_RETRIES) {
+ /*
+ * If the link is in Detect.Quiet or Detect.Active state, it
+ * indicates that no device is detected.
+ */
+ ltssm = dw_pcie_get_ltssm(pci);
+ if (ltssm == DW_PCIE_LTSSM_DETECT_QUIET ||
+ ltssm == DW_PCIE_LTSSM_DETECT_ACT) {
+ dev_info(pci->dev, "Device not found\n");
+ return -ENODEV;
+ }
+
dev_info(pci->dev, "Phy link never came up\n");
return -ETIMEDOUT;
}
--
2.48.1
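The classification this patch adds can be sketched outside the kernel as a small pure function. This is only an illustration: the enum and function names below are hypothetical stand-ins, not the driver's API, and the numeric state values are the ones shown by the ltssm_status debugfs dumps later in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical subset of the LTSSM states. The values mirror the
 * debugfs dumps in this thread (e.g. "POLL_COMPLIANCE (0x03)").
 */
enum ltssm_state {
	LTSSM_DETECT_QUIET    = 0x00,
	LTSSM_DETECT_ACT      = 0x01,
	LTSSM_POLL_ACTIVE     = 0x02,
	LTSSM_POLL_COMPLIANCE = 0x03,
	LTSSM_L0              = 0x11,
};

/*
 * Map the result of a link wait to an error code:
 *   link up                      -> 0
 *   timed out in a Detect state  -> -ENODEV (no device detected)
 *   timed out in any other state -> -ETIMEDOUT
 */
static int classify_link_wait(bool link_up, enum ltssm_state ltssm)
{
	if (link_up)
		return 0;

	if (ltssm == LTSSM_DETECT_QUIET || ltssm == LTSSM_DETECT_ACT)
		return -ENODEV;

	return -ETIMEDOUT;
}
```

With this mapping, a timeout with the LTSSM in Poll.Compliance yields -ETIMEDOUT, while a timeout in either Detect state yields -ENODEV.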
* [PATCH v3 2/4] PCI: dwc: Rename and move ltssm_status_string() to pcie-designware.c
2025-12-30 15:07 [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Manivannan Sadhasivam via B4 Relay
2025-12-30 15:07 ` [PATCH v3 1/4] PCI: dwc: Return -ENODEV from dw_pcie_wait_for_link() if device is not found Manivannan Sadhasivam via B4 Relay
@ 2025-12-30 15:07 ` Manivannan Sadhasivam via B4 Relay
2025-12-30 15:07 ` [PATCH v3 3/4] PCI: dwc: Rework the error print of dw_pcie_wait_for_link() Manivannan Sadhasivam via B4 Relay
` (3 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-12-30 15:07 UTC (permalink / raw)
To: Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas
Cc: linux-pci, linux-kernel, vincent.guittot, zhangsenchuan,
Shawn Lin, Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Rename ltssm_status_string() to dw_pcie_ltssm_status_string() and move it
to the common file pcie-designware.c so that it can be used outside of
pcie-designware-debugfs.c.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
.../pci/controller/dwc/pcie-designware-debugfs.c | 54 +---------------------
drivers/pci/controller/dwc/pcie-designware.c | 52 +++++++++++++++++++++
drivers/pci/controller/dwc/pcie-designware.h | 2 +
3 files changed, 55 insertions(+), 53 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware-debugfs.c b/drivers/pci/controller/dwc/pcie-designware-debugfs.c
index df98fee69892..0d1340c9b364 100644
--- a/drivers/pci/controller/dwc/pcie-designware-debugfs.c
+++ b/drivers/pci/controller/dwc/pcie-designware-debugfs.c
@@ -443,65 +443,13 @@ static ssize_t counter_value_read(struct file *file, char __user *buf,
return simple_read_from_buffer(buf, count, ppos, debugfs_buf, pos);
}
-static const char *ltssm_status_string(enum dw_pcie_ltssm ltssm)
-{
- const char *str;
-
- switch (ltssm) {
-#define DW_PCIE_LTSSM_NAME(n) case n: str = #n; break
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DETECT_QUIET);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DETECT_ACT);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_POLL_ACTIVE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_POLL_COMPLIANCE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_POLL_CONFIG);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_PRE_DETECT_QUIET);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DETECT_WAIT);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_LINKWD_START);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_LINKWD_ACEPT);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_LANENUM_WAI);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_LANENUM_ACEPT);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_COMPLETE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_IDLE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_LOCK);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_SPEED);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_RCVRCFG);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_IDLE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L0);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L0S);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L123_SEND_EIDLE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L1_IDLE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L2_IDLE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L2_WAKE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DISABLED_ENTRY);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DISABLED_IDLE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DISABLED);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_LPBK_ENTRY);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_LPBK_ACTIVE);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_LPBK_EXIT);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_LPBK_EXIT_TIMEOUT);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_HOT_RESET_ENTRY);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_HOT_RESET);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_EQ0);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_EQ1);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_EQ2);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_EQ3);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L1_1);
- DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L1_2);
- default:
- str = "DW_PCIE_LTSSM_UNKNOWN";
- break;
- }
-
- return str + strlen("DW_PCIE_LTSSM_");
-}
-
static int ltssm_status_show(struct seq_file *s, void *v)
{
struct dw_pcie *pci = s->private;
enum dw_pcie_ltssm val;
val = dw_pcie_get_ltssm(pci);
- seq_printf(s, "%s (0x%02x)\n", ltssm_status_string(val), val);
+ seq_printf(s, "%s (0x%02x)\n", dw_pcie_ltssm_status_string(val), val);
return 0;
}
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 55c1c60f7f8f..87f2ebc134d6 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -692,6 +692,58 @@ void dw_pcie_disable_atu(struct dw_pcie *pci, u32 dir, int index)
dw_pcie_writel_atu(pci, dir, index, PCIE_ATU_REGION_CTRL2, 0);
}
+const char *dw_pcie_ltssm_status_string(enum dw_pcie_ltssm ltssm)
+{
+ const char *str;
+
+ switch (ltssm) {
+#define DW_PCIE_LTSSM_NAME(n) case n: str = #n; break
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DETECT_QUIET);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DETECT_ACT);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_POLL_ACTIVE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_POLL_COMPLIANCE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_POLL_CONFIG);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_PRE_DETECT_QUIET);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DETECT_WAIT);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_LINKWD_START);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_LINKWD_ACEPT);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_LANENUM_WAI);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_LANENUM_ACEPT);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_COMPLETE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_CFG_IDLE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_LOCK);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_SPEED);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_RCVRCFG);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_IDLE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L0);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L0S);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L123_SEND_EIDLE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L1_IDLE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L2_IDLE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L2_WAKE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DISABLED_ENTRY);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DISABLED_IDLE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_DISABLED);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_LPBK_ENTRY);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_LPBK_ACTIVE);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_LPBK_EXIT);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_LPBK_EXIT_TIMEOUT);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_HOT_RESET_ENTRY);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_HOT_RESET);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_EQ0);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_EQ1);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_EQ2);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_RCVRY_EQ3);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L1_1);
+ DW_PCIE_LTSSM_NAME(DW_PCIE_LTSSM_L1_2);
+ default:
+ str = "DW_PCIE_LTSSM_UNKNOWN";
+ break;
+ }
+
+ return str + strlen("DW_PCIE_LTSSM_");
+}
+
/**
* dw_pcie_wait_for_link - Wait for the PCIe link to be up
* @pci: DWC instance
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index f87c67a7a482..c1def4d9cf62 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -828,6 +828,8 @@ static inline enum dw_pcie_ltssm dw_pcie_get_ltssm(struct dw_pcie *pci)
return (enum dw_pcie_ltssm)FIELD_GET(PORT_LOGIC_LTSSM_STATE_MASK, val);
}
+const char *dw_pcie_ltssm_status_string(enum dw_pcie_ltssm ltssm);
+
#ifdef CONFIG_PCIE_DW_HOST
int dw_pcie_suspend_noirq(struct dw_pcie *pci);
int dw_pcie_resume_noirq(struct dw_pcie *pci);
--
2.48.1
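For readers unfamiliar with the stringification idiom used by the moved helper, here is a minimal standalone sketch. The DEMO_* names are hypothetical and only three states are listed; the structure (a case-generating macro using #n, then skipping the common prefix with strlen()) follows the function in the diff above.

```c
#include <string.h>

/* Hypothetical three-state enum standing in for enum dw_pcie_ltssm. */
enum demo_ltssm {
	DEMO_LTSSM_DETECT_QUIET = 0x00,
	DEMO_LTSSM_DETECT_ACT   = 0x01,
	DEMO_LTSSM_L0           = 0x11,
};

static const char *demo_ltssm_string(enum demo_ltssm ltssm)
{
	const char *str;

	switch (ltssm) {
/* #n turns the enumerator name itself into a string literal. */
#define DEMO_LTSSM_NAME(n) case n: str = #n; break
	DEMO_LTSSM_NAME(DEMO_LTSSM_DETECT_QUIET);
	DEMO_LTSSM_NAME(DEMO_LTSSM_DETECT_ACT);
	DEMO_LTSSM_NAME(DEMO_LTSSM_L0);
#undef DEMO_LTSSM_NAME
	default:
		/* Keep the same prefix so the offset below stays valid. */
		str = "DEMO_LTSSM_UNKNOWN";
		break;
	}

	/* Skip the common prefix: "DEMO_LTSSM_L0" is returned as "L0". */
	return str + strlen("DEMO_LTSSM_");
}
```

The default string deliberately carries the same prefix as the enumerators, so the pointer arithmetic in the return statement is valid for every branch.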
* [PATCH v3 3/4] PCI: dwc: Rework the error print of dw_pcie_wait_for_link()
2025-12-30 15:07 [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Manivannan Sadhasivam via B4 Relay
2025-12-30 15:07 ` [PATCH v3 1/4] PCI: dwc: Return -ENODEV from dw_pcie_wait_for_link() if device is not found Manivannan Sadhasivam via B4 Relay
2025-12-30 15:07 ` [PATCH v3 2/4] PCI: dwc: Rename and move ltssm_status_string() to pcie-designware.c Manivannan Sadhasivam via B4 Relay
@ 2025-12-30 15:07 ` Manivannan Sadhasivam via B4 Relay
2025-12-30 15:07 ` [PATCH v3 4/4] PCI: dwc: Only skip the dw_pcie_wait_for_link() failure if it returns -ENODEV Manivannan Sadhasivam via B4 Relay
` (2 subsequent siblings)
5 siblings, 0 replies; 17+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-12-30 15:07 UTC (permalink / raw)
To: Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas
Cc: linux-pci, linux-kernel, vincent.guittot, zhangsenchuan,
Shawn Lin, Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
If the link fails to come up even after detecting the device on the bus,
i.e., if the LTSSM is in neither Detect.Quiet nor Detect.Active state, then
dw_pcie_wait_for_link() should log it as an error.
So promote dev_info() to dev_err(), reword the error log to make it clear
and also print the LTSSM state to aid debugging.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/controller/dwc/pcie-designware.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 87f2ebc134d6..c2dfadc53d04 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -776,7 +776,8 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
return -ENODEV;
}
- dev_info(pci->dev, "Phy link never came up\n");
+ dev_err(pci->dev, "Link failed to come up. LTSSM: %s\n",
+ dw_pcie_ltssm_status_string(ltssm));
return -ETIMEDOUT;
}
--
2.48.1
* [PATCH v3 4/4] PCI: dwc: Only skip the dw_pcie_wait_for_link() failure if it returns -ENODEV
2025-12-30 15:07 [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Manivannan Sadhasivam via B4 Relay
` (2 preceding siblings ...)
2025-12-30 15:07 ` [PATCH v3 3/4] PCI: dwc: Rework the error print of dw_pcie_wait_for_link() Manivannan Sadhasivam via B4 Relay
@ 2025-12-30 15:07 ` Manivannan Sadhasivam via B4 Relay
2026-01-02 12:01 ` [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Niklas Cassel
2026-01-05 10:04 ` Vincent Guittot
5 siblings, 0 replies; 17+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-12-30 15:07 UTC (permalink / raw)
To: Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas
Cc: linux-pci, linux-kernel, vincent.guittot, zhangsenchuan,
Shawn Lin, Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
dw_pcie_wait_for_link() now returns -ENODEV if the device is not found on
the bus and -ETIMEDOUT if the link fails to come up for any other reason.
It is incorrect to skip link up failures other than device not found. So
only skip the failure for the device not found case and handle failures
for other reasons.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
---
drivers/pci/controller/dwc/pcie-designware-host.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index fad0cbedefbc..ccde12b85463 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -675,8 +675,10 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
goto err_remove_edma;
}
- /* Ignore errors, the link may come up later */
- dw_pcie_wait_for_link(pci);
+ /* Skip failure if the device is not found as it may show up later */
+ ret = dw_pcie_wait_for_link(pci);
+ if (ret && ret != -ENODEV)
+ goto err_stop_link;
ret = pci_host_probe(bridge);
if (ret)
--
2.48.1
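The caller-side pattern from this patch, reduced to a standalone sketch. Everything here is a hypothetical stand-in: wait_ret plays the role of the dw_pcie_wait_for_link() return value, and the link_stopped flag stands in for whatever cleanup the err_stop_link label performs in the real host init path.

```c
#include <errno.h>

/*
 * Hypothetical reduction of the host init tail:
 *   0 or -ENODEV from the link wait -> carry on (device may show up later)
 *   any other error                 -> unwind via err_stop_link
 */
static int host_init_sketch(int wait_ret, int *link_stopped)
{
	int ret;

	*link_stopped = 0;

	/* Stands in for: ret = dw_pcie_wait_for_link(pci); */
	ret = wait_ret;
	if (ret && ret != -ENODEV)
		goto err_stop_link;

	/* The real code would continue with the host bridge probe here. */
	return 0;

err_stop_link:
	/* Stands in for the cleanup done at this label. */
	*link_stopped = 1;
	return ret;
}
```

Only -ENODEV is tolerated; every other non-zero value, including -ETIMEDOUT, takes the error path and is propagated to the caller.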
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2025-12-30 15:07 [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Manivannan Sadhasivam via B4 Relay
` (3 preceding siblings ...)
2025-12-30 15:07 ` [PATCH v3 4/4] PCI: dwc: Only skip the dw_pcie_wait_for_link() failure if it returns -ENODEV Manivannan Sadhasivam via B4 Relay
@ 2026-01-02 12:01 ` Niklas Cassel
2026-01-05 11:41 ` Manivannan Sadhasivam
2026-01-21 12:45 ` Shawn Lin
2026-01-05 10:04 ` Vincent Guittot
5 siblings, 2 replies; 17+ messages in thread
From: Niklas Cassel @ 2026-01-02 12:01 UTC (permalink / raw)
To: manivannan.sadhasivam
Cc: Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, Shawn Lin, dlemoal
On Tue, Dec 30, 2025 at 08:37:31PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> Hi,
>
> This series reworks the dw_pcie_wait_for_link() API to allow the callers to
> detect the absence of the device on the bus and skip the failure.
>
> Compared to v2, I've reworked the patch 2 to improve the API further and
> dropped the patch 1 that got applied (hence changed the subject). I've also
> modified the error code based on the feedback in v2 to return -ENODEV if device
> is not detected on the bus and -ETIMEDOUT otherwise. This allows the callers to
> skip the failure if device is not detected and handle error for other failure.
>
> Testing
> =======
>
> Tested this series on Rb3Gen2 board without powering on the PCIe switch. Now the
> dw_pcie_wait_for_link() API prints:
>
> qcom-pcie 1c08000.pcie: Device not found
>
> Instead of the previous log:
>
> qcom-pcie 1c08000.pcie: Phy link never came up
Hello Mani,
I really like this series.
However, I tested my usual setup with two Rock 5B boards, one in EP mode and
one in RC mode. I usually power on both boards at the same time, and only
after both boards have booted do I do the configfs write to enable link
training on the EP, followed by a rescan on the RC.
Even with this series, this workflow still works in 8 out of 10 boots.
However, in 2 out of 10 boots I instead got:
[ 2.285827] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_COMPLIANCE
[ 2.286584] rockchip-dw-pcie a40000000.pcie: probe with driver rockchip-dw-pcie failed with error -110
In both cases LTSSM was in POLL_COMPLIANCE.
Considering that things work in 8 out of 10 boots, the LTSSM state must have
been Detect.Quiet or Detect.Active in those boots.
I commented out the goto err_stop_link taken when dw_pcie_wait_for_link()
fails, so that I can dump LTSSM afterwards when this happens.
[ 2.293785] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_COMPLIANCE
Then I do:
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_COMPLIANCE (0x03)
So LTSSM is still in Poll.Compliance.
However, as soon as I do the configfs writes on the EP board:
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
L0 (0x11)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
L0 (0x11)
LTSSM transitions out of compliance, and rescan will find my device:
# echo 1 > /sys/bus/pci/devices/0000:00:00.0/rescan
[ 246.777867] pci 0000:01:00.0: [1d87:3588] type 00 class 0xff0000 PCIe Endpoint
[ 246.778627] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff]
[ 246.779151] pci 0000:01:00.0: BAR 1 [mem 0x00000000-0x000fffff]
[ 246.779672] pci 0000:01:00.0: BAR 2 [mem 0x00000000-0x000fffff]
[ 246.780192] pci 0000:01:00.0: BAR 3 [mem 0x00000000-0x000fffff]
[ 246.780716] pci 0000:01:00.0: BAR 5 [mem 0x00000000-0x000fffff]
[ 246.781236] pci 0000:01:00.0: ROM [mem 0x00000000-0x0000ffff pref]
I understand that in most normal situations, the endpoint is powered on
before powering on the host side (or there is no EP connected at all).
But somehow, for us PCIe endpoint developers, it would be nice if we
could keep the behavior of being able to rescan the bus, even when the EP
is not powered on before the host side.
Perhaps a Kconfig or module param? Suggestions?
Kind regards,
Niklas
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-02 12:01 ` [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Niklas Cassel
@ 2026-01-05 11:41 ` Manivannan Sadhasivam
2026-01-07 12:52 ` Niklas Cassel
2026-01-21 12:45 ` Shawn Lin
1 sibling, 1 reply; 17+ messages in thread
From: Manivannan Sadhasivam @ 2026-01-05 11:41 UTC (permalink / raw)
To: Niklas Cassel
Cc: manivannan.sadhasivam, Jingoo Han, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, Shawn Lin, dlemoal
On Fri, Jan 02, 2026 at 01:01:02PM +0100, Niklas Cassel wrote:
> On Tue, Dec 30, 2025 at 08:37:31PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > Hi,
> >
> > This series reworks the dw_pcie_wait_for_link() API to allow the callers to
> > detect the absence of the device on the bus and skip the failure.
> >
> > Compared to v2, I've reworked the patch 2 to improve the API further and
> > dropped the patch 1 that got applied (hence changed the subject). I've also
> > modified the error code based on the feedback in v2 to return -ENODEV if device
> > is not detected on the bus and -ETIMEDOUT otherwise. This allows the callers to
> > skip the failure if device is not detected and handle error for other failure.
> >
> > Testing
> > =======
> >
> > Tested this series on Rb3Gen2 board without powering on the PCIe switch. Now the
> > dw_pcie_wait_for_link() API prints:
> >
> > qcom-pcie 1c08000.pcie: Device not found
> >
> > Instead of the previous log:
> >
> > qcom-pcie 1c08000.pcie: Phy link never came up
>
> Hello Mani,
>
> I really like this series.
>
> However when testing my usual setup with 2 Rock 5B:s, one in EP mode, one
> in RC mode, where I usually power on both boards at the same time, but only
> after both boards are booted, do I do the configfs write to enable the link
> training on EP, and then do a rescan on the RC.
>
> Even with this series, this workflow still works in 8 out of 10 boots.
>
>
> However, in 2 out of 10 boots I instead got:
> [ 2.285827] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_COMPLIANCE
> [ 2.286584] rockchip-dw-pcie a40000000.pcie: probe with driver rockchip-dw-pcie failed with error -110
>
> In both cases LTSSM was in POLL_COMPLIANCE.
>
>
> Considering that things work in 8 out of 10 boots, means that the LTSSM state
> was in Detect.Quiet or Detect.Active.
>
> I did comment out goto err_stop_link if dw_pcie_wait_for_link(), so I can dump
> LTSSM afterwards, when this happens.
>
> [ 2.293785] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_COMPLIANCE
>
> Then I do:
>
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_COMPLIANCE (0x03)
>
> So LTSSM is still in Poll.Compliance.
>
> However, as soon as I do the configfs writes on the EP board:
>
>
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> L0 (0x11)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> L0 (0x11)
>
> LTSSM transitions out of compliance, and rescan will find my device:
>
> # echo 1 > /sys/bus/pci/devices/0000:00:00.0/rescan
> [ 246.777867] pci 0000:01:00.0: [1d87:3588] type 00 class 0xff0000 PCIe Endpoint
> [ 246.778627] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff]
> [ 246.779151] pci 0000:01:00.0: BAR 1 [mem 0x00000000-0x000fffff]
> [ 246.779672] pci 0000:01:00.0: BAR 2 [mem 0x00000000-0x000fffff]
> [ 246.780192] pci 0000:01:00.0: BAR 3 [mem 0x00000000-0x000fffff]
> [ 246.780716] pci 0000:01:00.0: BAR 5 [mem 0x00000000-0x000fffff]
> [ 246.781236] pci 0000:01:00.0: ROM [mem 0x00000000-0x0000ffff pref]
>
>
>
> I understand that in most normal situations, the endpoint is powered on
> before powering on the host side (or there is no EP connected at all).
> But somehow, for us PCIe endpoint developers, it would be nice if we
> could keep the behavior of being able to rescan the bus, even when the EP
> is not powered on before the host side.
>
What could be happening here is that since the endpoint is physically connected
to the bus, the receiver gets detected during Detect.Active state and LTSSM
enters the Polling state. I think the reason why it ended up staying in
Poll.Compliance could be due to (as per the spec):
a. Not all Lanes from the predetermined set of Lanes from above have
detected an exit from Electrical Idle since entering Polling.Active.
b. Any Lane that detected a Receiver during Detect received eight consecutive
TS1 Ordered Sets (or their complement) with the Lane and Link numbers set to
PAD, the Compliance Receive bit (bit 4 of Symbol 5) is 1b, and the Loopback bit
(bit 2 of Symbol 5) is 0b.
So this is perfectly legal from endpoint perspective.
> Perhaps a Kconfig or module param? Suggestions?
>
There is a DIRECT_POLCOMP_TO_DETECT bit (bit 9) in the DBI SD_CONTROL2 register.
This bit ensures that the LTSSM does not get stuck in Poll.Compliance and
returns to the Detect state instead. Could you set it on the EP before starting
LTSSM and see if it helps?
- Mani
--
மணிவண்ணன் சதாசிவம்
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-05 11:41 ` Manivannan Sadhasivam
@ 2026-01-07 12:52 ` Niklas Cassel
2026-01-09 16:21 ` Niklas Cassel
0 siblings, 1 reply; 17+ messages in thread
From: Niklas Cassel @ 2026-01-07 12:52 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: manivannan.sadhasivam, Jingoo Han, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, Shawn Lin, dlemoal
On Mon, Jan 05, 2026 at 05:11:42PM +0530, Manivannan Sadhasivam wrote:
> On Fri, Jan 02, 2026 at 01:01:02PM +0100, Niklas Cassel wrote:
> > On Tue, Dec 30, 2025 at 08:37:31PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
>
> What could be happening here is that since the endpoint is physically connected
> to the bus, the receiver gets detected during Detect.Active state and LTSSM
> enters the Polling state. I think the reason why it ended up staying in
> Poll.Compliance could be due to (as per the spec):
>
> a. Not all Lanes from the predetermined set of Lanes from above have
> detected an exit from Electrical Idle since entering Polling.Active.
>
> b. Any Lane that detected a Receiver during Detect received eight consecutive
> TS1 Ordered Sets (or their complement) with the Lane and Link numbers set to
> PAD, the Compliance Receive bit (bit 4 of Symbol 5) is 1b, and the Loopback bit
> (bit 2 of Symbol 5) is 0b that the Compliance Receive bit (bit 4 of Symbol 5) is
> set.
>
> So this is perfectly legal from endpoint perspective.
>
> > Perhaps a Kconfig or module param? Suggestions?
> >
>
> There is a DIRECT_POLCOMP_TO_DETECT bit (bit 9) in DBI SD_CONTROL2 register.
> This bit will ensure that the LTSSM will not stuck in Poll.Compliance and will
> return back to Detect state. Could you set it on the EP before starting LTSSM
> and see if it helps?
I will test and get back to you.
Kind regards,
Niklas
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-07 12:52 ` Niklas Cassel
@ 2026-01-09 16:21 ` Niklas Cassel
2026-01-16 8:57 ` Manivannan Sadhasivam
0 siblings, 1 reply; 17+ messages in thread
From: Niklas Cassel @ 2026-01-09 16:21 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: manivannan.sadhasivam, Jingoo Han, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, Shawn Lin, dlemoal
Hello Mani,
On Wed, Jan 07, 2026 at 01:52:57PM +0100, Niklas Cassel wrote:
> On Mon, Jan 05, 2026 at 05:11:42PM +0530, Manivannan Sadhasivam wrote:
> > On Fri, Jan 02, 2026 at 01:01:02PM +0100, Niklas Cassel wrote:
> > > On Tue, Dec 30, 2025 at 08:37:31PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> >
> > What could be happening here is that since the endpoint is physically connected
> > to the bus, the receiver gets detected during Detect.Active state and LTSSM
> > enters the Polling state. I think the reason why it ended up staying in
> > Poll.Compliance could be due to (as per the spec):
> >
> > a. Not all Lanes from the predetermined set of Lanes from above have
> > detected an exit from Electrical Idle since entering Polling.Active.
> >
> > b. Any Lane that detected a Receiver during Detect received eight consecutive
> > TS1 Ordered Sets (or their complement) with the Lane and Link numbers set to
> > PAD, the Compliance Receive bit (bit 4 of Symbol 5) is 1b, and the Loopback bit
> > (bit 2 of Symbol 5) is 0b that the Compliance Receive bit (bit 4 of Symbol 5) is
> > set.
> >
> > So this is perfectly legal from endpoint perspective.
> >
> > > Perhaps a Kconfig or module param? Suggestions?
> > >
> >
> > There is a DIRECT_POLCOMP_TO_DETECT bit (bit 9) in DBI SD_CONTROL2 register.
> > This bit will ensure that the LTSSM will not stuck in Poll.Compliance and will
> > return back to Detect state. Could you set it on the EP before starting LTSSM
> > and see if it helps?
>
> I will test and get back to you.
Looking at the databook, it appears that the SD_CONTROL2 register only exists
if CX_RAS_DES_ENABLE is enabled, and that the register is located in the
RAS_DES capability.
RK3588 implements the RAS_DES capability, so it can set that bit, but most
likely there are some platforms that do not.
Anyway, I tried the following patch:
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index c30a2ed324cd..73d3d4bc1886 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -584,6 +584,7 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
struct device_node *np = dev->of_node;
struct pci_host_bridge *bridge;
int ret;
+ int ras_cap;
raw_spin_lock_init(&pp->lock);
@@ -670,6 +671,15 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
if (ret)
goto err_remove_edma;
+#define SD_CONTROL2_REG 0xa4
+ ras_cap = dw_pcie_find_rasdes_capability(pci);
+ if (ras_cap) {
+ u32 val;
+ val = dw_pcie_readl_dbi(pci, ras_cap + SD_CONTROL2_REG);
+ val |= BIT(9);
+ dw_pcie_writel_dbi(pci, ras_cap + SD_CONTROL2_REG, val);
+ }
+
if (!dw_pcie_link_up(pci)) {
ret = dw_pcie_start_link(pci);
if (ret)
And now, in the boots where the link fails to come up (every second or third
boot), LTSSM is no longer in POLL_COMPLIANCE. Instead, it is now always in:
[ 2.298107] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_ACTIVE
I commented out the goto err_stop_link taken when dw_pcie_wait_for_link()
fails, so that I can dump LTSSM afterwards when this happens:
[ 2.297916] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_ACTIVE
Then I do:
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_ACT (0x01)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_ACT (0x01)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
POLL_ACTIVE (0x02)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
DETECT_QUIET (0x00)
So it appears that after setting the DIRECT_POLCOMP_TO_DETECT bit,
instead of LTSSM being stuck in POLL_COMPLIANCE, LTSSM seems to
jump between DETECT_QUIET / DETECT_ACT / POLL_ACTIVE.
And just like before, as soon as I do the configfs writes on the EP board
to start the link:
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
L0 (0x11)
# cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
L0 (0x11)
LTSSM transitions out of compliance, and rescan finds my device.
So I don't think that setting the DIRECT_POLCOMP_TO_DETECT bit will
help us PCIe endpoint developers to continue with the workflow where we
can simply do a rescan on the host after starting the link training on
the EP.
Back to finding another alternative. Kconfig? module param? Suggestions?
Kind regards,
Niklas
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-09 16:21 ` Niklas Cassel
@ 2026-01-16 8:57 ` Manivannan Sadhasivam
2026-01-20 14:35 ` Niklas Cassel
0 siblings, 1 reply; 17+ messages in thread
From: Manivannan Sadhasivam @ 2026-01-16 8:57 UTC (permalink / raw)
To: Niklas Cassel
Cc: manivannan.sadhasivam, Jingoo Han, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, Shawn Lin, dlemoal
On Fri, Jan 09, 2026 at 05:21:37PM +0100, Niklas Cassel wrote:
> Hello Mani,
>
> On Wed, Jan 07, 2026 at 01:52:57PM +0100, Niklas Cassel wrote:
> > On Mon, Jan 05, 2026 at 05:11:42PM +0530, Manivannan Sadhasivam wrote:
> > > On Fri, Jan 02, 2026 at 01:01:02PM +0100, Niklas Cassel wrote:
> > > > On Tue, Dec 30, 2025 at 08:37:31PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > >
> > > What could be happening here is that since the endpoint is physically connected
> > > to the bus, the receiver gets detected during Detect.Active state and LTSSM
> > > enters the Polling state. I think the reason why it ended up staying in
> > > Poll.Compliance could be due to (as per the spec):
> > >
> > > a. Not all Lanes from the predetermined set of Lanes from above have
> > > detected an exit from Electrical Idle since entering Polling.Active.
> > >
> > > b. Any Lane that detected a Receiver during Detect received eight consecutive
> > > TS1 Ordered Sets (or their complement) with the Lane and Link numbers set to
> > > PAD, the Compliance Receive bit (bit 4 of Symbol 5) is 1b, and the Loopback bit
> > > (bit 2 of Symbol 5) is 0b that the Compliance Receive bit (bit 4 of Symbol 5) is
> > > set.
> > >
> > > So this is perfectly legal from endpoint perspective.
> > >
> > > > Perhaps a Kconfig or module param? Suggestions?
> > > >
> > >
> > > There is a DIRECT_POLCOMP_TO_DETECT bit (bit 9) in DBI SD_CONTROL2 register.
> > > This bit ensures that the LTSSM does not get stuck in Poll.Compliance and
> > > returns to the Detect state. Could you set it on the EP before starting LTSSM
> > > and see if it helps?
> >
> > I will test and get back to you.
>
> Looking at the databook, it appears that the SD_CONTROL2 register only exists
> if CX_RAS_DES_ENABLE, and the register is located in RAS_DES capability.
>
> RK3588 implements the RAS_DES capability, so it can set that bit, but most
> likely there are some platforms that do not.
>
True.
>
> Anyway, I tried the following patch:
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index c30a2ed324cd..73d3d4bc1886 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -584,6 +584,7 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
> struct device_node *np = dev->of_node;
> struct pci_host_bridge *bridge;
> int ret;
> + int ras_cap;
>
> raw_spin_lock_init(&pp->lock);
>
> @@ -670,6 +671,15 @@ int dw_pcie_host_init(struct dw_pcie_rp *pp)
> if (ret)
> goto err_remove_edma;
>
> +#define SD_CONTROL2_REG 0xa4
> + ras_cap = dw_pcie_find_rasdes_capability(pci);
> + if (ras_cap) {
> + u32 val;
> + val = dw_pcie_readl_dbi(pci, ras_cap + SD_CONTROL2_REG);
> + val |= BIT(9);
> + dw_pcie_writel_dbi(pci, ras_cap + SD_CONTROL2_REG, val);
> + }
> +
> if (!dw_pcie_link_up(pci)) {
> ret = dw_pcie_start_link(pci);
> if (ret)
>
>
> Now, on every second or third boot, LTSSM no longer ends up in POLL_COMPLIANCE.
> Instead, on those boots the link wait fails with:
> [ 2.298107] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_ACTIVE
>
>
> I commented out the goto err_stop_link taken when dw_pcie_wait_for_link() fails,
> so that I can dump LTSSM afterwards when this happens:
>
> [ 2.297916] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_ACTIVE
>
> Then I do:
>
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_ACT (0x01)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_ACT (0x01)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_ACTIVE (0x02)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> DETECT_QUIET (0x00)
>
> So it appears that after setting the DIRECT_POLCOMP_TO_DETECT bit,
> instead of LTSSM being stuck in POLL_COMPLIANCE, LTSSM seems to
> jump between DETECT_QUIET / DETECT_ACT / POLL_ACTIVE.
>
Thanks for testing it out. I was expecting the device to just stay in the DETECT
states, but it looks like the cycle just continues, which is also fair.
>
> And just like before, as soon as I do the configfs writes on the EP board
> to start the link:
>
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> L0 (0x11)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> L0 (0x11)
>
> LTSSM transitions out of compliance, and rescan finds my device.
>
>
> So I don't think that setting the DIRECT_POLCOMP_TO_DETECT bit will
> help us PCIe endpoint developers to continue with the workflow where we
> can simply do a rescan on the host after starting the link training on
> the EP.
>
> Back to finding another alternative. Kconfig? module param? Suggestions?
>
I would rather not let the user control this behavior, since it is just how the
link behaves. Maybe we can allow the link to stay in POLL.Compliance, print a
different message, and still return -ENODEV? Something like:
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index c2dfadc53d04..21ce206f359b 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -774,6 +774,14 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
ltssm == DW_PCIE_LTSSM_DETECT_ACT) {
dev_info(pci->dev, "Device not found\n");
return -ENODEV;
+ /*
+ * If the link is in POLL.Compliance state, then the device is
+ * found to be connected to the bus, but it is not active i.e.,
+ * the device firmware might not yet be initialized.
+ */
+ } else if (ltssm == DW_PCIE_LTSSM_POLL_COMPLIANCE) {
+ dev_info(pci->dev, "Device found, but not active\n");
+ return -ENODEV;
}
dev_err(pci->dev, "Link failed to come up. LTSSM: %s\n",
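As a rough userspace sketch of the decision the diff above makes once the wait has timed out (this is not the kernel code): the enum values are the LTSSM codes printed by ltssm_status earlier in this thread, while the helper name classify_link_timeout() and the enum names are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* LTSSM codes as printed by the ltssm_status debugfs file in this thread. */
enum ltssm_state {
	LTSSM_DETECT_QUIET    = 0x00,
	LTSSM_DETECT_ACT      = 0x01,
	LTSSM_POLL_ACTIVE     = 0x02,
	LTSSM_POLL_COMPLIANCE = 0x03,
	LTSSM_L0              = 0x11,
};

/*
 * Hypothetical helper: map the LTSSM state sampled after the link wait
 * timed out to an errno value and a log message. It is only meant to be
 * called on the failure path, i.e. when the link did not come up.
 */
static int classify_link_timeout(enum ltssm_state ltssm, const char **msg)
{
	assert(msg != NULL);

	switch (ltssm) {
	case LTSSM_DETECT_QUIET:
	case LTSSM_DETECT_ACT:
		/* No receiver detected: nothing is connected to the bus. */
		*msg = "Device not found";
		return -ENODEV;
	case LTSSM_POLL_COMPLIANCE:
		/* Receiver detected, but the other side is not training. */
		*msg = "Device found, but not active";
		return -ENODEV;
	default:
		/* Any other state is treated as a real link failure. */
		*msg = "Link failed to come up";
		return -ETIMEDOUT;
	}
}
```

With this mapping, POLL_ACTIVE still falls through to -ETIMEDOUT, which matches the "Link failed to come up. LTSSM: POLL_ACTIVE" log seen after setting DIRECT_POLCOMP_TO_DETECT.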
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-16 8:57 ` Manivannan Sadhasivam
@ 2026-01-20 14:35 ` Niklas Cassel
0 siblings, 0 replies; 17+ messages in thread
From: Niklas Cassel @ 2026-01-20 14:35 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: manivannan.sadhasivam, Jingoo Han, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, Shawn Lin, dlemoal
On Fri, Jan 16, 2026 at 02:27:39PM +0530, Manivannan Sadhasivam wrote:
(snip)
> > So I don't think that setting the DIRECT_POLCOMP_TO_DETECT bit will
> > help us PCIe endpoint developers to continue with the workflow where we
> > can simply do a rescan on the host after starting the link training on
> > the EP.
> >
> > Back to finding another alternative. Kconfig? module param? Suggestions?
> >
>
> I don't like the user to control this behavior as it is just how the link
> behaves. Maybe we can allow the link to stay in POLL and print out a different
> message, and still return -ENODEV? Like,
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index c2dfadc53d04..21ce206f359b 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -774,6 +774,14 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
> ltssm == DW_PCIE_LTSSM_DETECT_ACT) {
> dev_info(pci->dev, "Device not found\n");
> return -ENODEV;
> + /*
> + * If the link is in POLL.Compliance state, then the device is
> + * found to be connected to the bus, but it is not active i.e.,
> + * the device firmware might not yet be initialized.
> + */
> + } else if (ltssm == DW_PCIE_LTSSM_POLL_COMPLIANCE) {
> + dev_info(pci->dev, "Device found, but not active\n");
> + return -ENODEV;
> }
>
> dev_err(pci->dev, "Link failed to come up. LTSSM: %s\n",
Seems like an excellent idea to me!
I tested it, and it works, thank you.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-02 12:01 ` [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Niklas Cassel
2026-01-05 11:41 ` Manivannan Sadhasivam
@ 2026-01-21 12:45 ` Shawn Lin
2026-01-21 13:22 ` Niklas Cassel
1 sibling, 1 reply; 17+ messages in thread
From: Shawn Lin @ 2026-01-21 12:45 UTC (permalink / raw)
To: Niklas Cassel
Cc: shawn.lin, Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, dlemoal,
manivannan.sadhasivam
On 2026/01/02 (Friday) 20:01, Niklas Cassel wrote:
> On Tue, Dec 30, 2025 at 08:37:31PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
>> Hi,
>>
>> This series reworks the dw_pcie_wait_for_link() API to allow the callers to
>> detect the absence of the device on the bus and skip the failure.
>>
>> Compared to v2, I've reworked the patch 2 to improve the API further and
>> dropped the patch 1 that got applied (hence changed the subject). I've also
>> modified the error code based on the feedback in v2 to return -ENODEV if device
>> is not detected on the bus and -ETIMEDOUT otherwise. This allows the callers to
>> skip the failure if device is not detected and handle error for other failure.
>>
>> Testing
>> =======
>>
>> Tested this series on Rb3Gen2 board without powering on the PCIe switch. Now the
>> dw_pcie_wait_for_link() API prints:
>>
>> qcom-pcie 1c08000.pcie: Device not found
>>
>> Instead of the previous log:
>>
>> qcom-pcie 1c08000.pcie: Phy link never came up
>
> Hello Mani,
>
> I really like this series.
>
> However when testing my usual setup with 2 Rock 5B:s, one in EP mode, one
> in RC mode, where I usually power on both boards at the same time, but only
> after both boards are booted, do I do the configfs write to enable the link
> training on EP, and then do a rescan on the RC.
>
> Even with this series, this workflow still works in 8 out of 10 boots.
>
>
> However, in 2 out of 10 boots I instead got:
> [ 2.285827] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_COMPLIANCE
> [ 2.286584] rockchip-dw-pcie a40000000.pcie: probe with driver rockchip-dw-pcie failed with error -110
>
> In both cases LTSSM was in POLL_COMPLIANCE.
>
>
> That things work in 8 out of 10 boots means that, in those boots, the LTSSM
> was in Detect.Quiet or Detect.Active.
>
> I commented out the goto err_stop_link taken when dw_pcie_wait_for_link() fails,
> so that I can dump LTSSM afterwards when this happens.
>
> [ 2.293785] rockchip-dw-pcie a40000000.pcie: Link failed to come up. LTSSM: POLL_COMPLIANCE
>
> Then I do:
>
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> POLL_COMPLIANCE (0x03)
>
> So LTSSM is still in Poll.Compliance.
>
> However, as soon as I do the configfs writes on the EP board:
>
>
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> L0 (0x11)
> # cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status
> L0 (0x11)
>
> LTSSM transitions out of compliance, and rescan will find my device:
>
> # echo 1 > /sys/bus/pci/devices/0000:00:00.0/rescan
> [ 246.777867] pci 0000:01:00.0: [1d87:3588] type 00 class 0xff0000 PCIe Endpoint
> [ 246.778627] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff]
> [ 246.779151] pci 0000:01:00.0: BAR 1 [mem 0x00000000-0x000fffff]
> [ 246.779672] pci 0000:01:00.0: BAR 2 [mem 0x00000000-0x000fffff]
> [ 246.780192] pci 0000:01:00.0: BAR 3 [mem 0x00000000-0x000fffff]
> [ 246.780716] pci 0000:01:00.0: BAR 5 [mem 0x00000000-0x000fffff]
> [ 246.781236] pci 0000:01:00.0: ROM [mem 0x00000000-0x0000ffff pref]
>
>
>
> I understand that in most normal situations, the endpoint is powered on
> before powering on the host side (or there is no EP connected at all).
> But somehow, for us PCIe endpoint developers, it would be nice if we
> could keep the behavior of being able to rescan the bus, even when the EP
> is not powered on before the host side.
>
> Perhaps a Kconfig or module param? Suggestions?
>
Hi Niklas,
Sorry for chiming in on this so late. There is a register called
PCIE_CLIENT_GENERAL_DEBUG_CON, which you can find in the RK3588 TRM. You could
hold LTSSM on the EP side in DETECT_QUIET before enabling training by
setting BIT(6). When the EP side is ready to go, just clear BIT(6),
so that the link can be established and the host side can rescan to
find the EP properly.
>
> Kind regards,
> Niklas
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-21 12:45 ` Shawn Lin
@ 2026-01-21 13:22 ` Niklas Cassel
2026-01-21 15:47 ` Manivannan Sadhasivam
2026-01-22 3:37 ` Shawn Lin
0 siblings, 2 replies; 17+ messages in thread
From: Niklas Cassel @ 2026-01-21 13:22 UTC (permalink / raw)
To: Shawn Lin
Cc: Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, dlemoal,
manivannan.sadhasivam
Hello Shawn,
On Wed, Jan 21, 2026 at 08:45:39PM +0800, Shawn Lin wrote:
> On 2026/01/02 (Friday) 20:01, Niklas Cassel wrote:
>
> Hi Niklas,
>
> Sorry for chiming in on this so late. There is a register called
> PCIE_CLIENT_GENERAL_DEBUG_CON you may find on RK3588 TRM, you could
> hold LTSSM on EP side in DETECT_QUIET before enabling training, by
> setting BIT(6). And when EP side is ready to go, just clear BIT(6),
> so the link is able to be established and host side can rescan to
> find the EP properly.
Thank you for the suggestion.
According to the register description of this debug control register,
for as long as sd_hold_ltssm is set, the controller stays in the
current LTSSM state.
We could probably set this on the EP side, and only when starting
the link do we clear this bit.
However, I think that Mani's current proposal:
https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=controller/dwc&id=01d16b8afb7afcc17f999f8b4a9b9cfe6c6fae71
will work with more controllers running in EP mode, not just RK3588.
Also, when powering on both boards at the same time, it is possible that
the host side driver gets probed before the EP side.
If the EP side driver has not been probed to set bit sd_hold_ltssm,
the host will still see a load connected, but link training will fail,
so it will still jump to Poll.Compliance.
So AFAICT, Mani's proposal:
1) Seems more generic.
2) Seems less racy.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-21 13:22 ` Niklas Cassel
@ 2026-01-21 15:47 ` Manivannan Sadhasivam
2026-01-22 3:37 ` Shawn Lin
1 sibling, 0 replies; 17+ messages in thread
From: Manivannan Sadhasivam @ 2026-01-21 15:47 UTC (permalink / raw)
To: Niklas Cassel
Cc: Shawn Lin, Jingoo Han, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, dlemoal,
manivannan.sadhasivam
On Wed, Jan 21, 2026 at 02:22:17PM +0100, Niklas Cassel wrote:
> Hello Shawn,
>
> On Wed, Jan 21, 2026 at 08:45:39PM +0800, Shawn Lin wrote:
> > On 2026/01/02 (Friday) 20:01, Niklas Cassel wrote:
> >
> > Hi Niklas,
> >
> > Sorry for chiming in on this so late. There is a register called
> > PCIE_CLIENT_GENERAL_DEBUG_CON you may find on RK3588 TRM, you could
> > hold LTSSM on EP side in DETECT_QUIET before enabling training, by
> > setting BIT(6). And when EP side is ready to go, just clear BIT(6),
> > so the link is able to be established and host side can rescan to
> > find the EP properly.
>
> Thank you for the suggestion.
>
> Reading the register description of this debug control register.
> For as long as sd_hold_ltssm is set, the controller stays in the
> current LTSSM.
>
> We could probably set this on the EP side, and only when starting
> the link do we clear this bit.
>
I did consider this option before, but the problem with sd_hold_ltssm is that we
cannot predict the LTSSM state at the time of the hold: it only freezes whatever
state the LTSSM is in when the bit is set, which might not always be DETECT.
> However, I think that Mani's current proposal:
> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=controller/dwc&id=01d16b8afb7afcc17f999f8b4a9b9cfe6c6fae71
>
>
> Will work with more controllers running in EP mode, not just rk3588.
>
> Also, when powering on both boards at the same time, it is possible that
> the host side driver gets probed before the EP side.
> If the EP side driver has not been probed to set bit sd_hold_ltssm,
> the host will still see a load connected, but link training will fail,
> so it will still jump to Poll.Compliance.
>
Exactly. By the time the hold happens, LTSSM could've moved to POLL states.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-21 13:22 ` Niklas Cassel
2026-01-21 15:47 ` Manivannan Sadhasivam
@ 2026-01-22 3:37 ` Shawn Lin
1 sibling, 0 replies; 17+ messages in thread
From: Shawn Lin @ 2026-01-22 3:37 UTC (permalink / raw)
To: Niklas Cassel
Cc: shawn.lin, Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, vincent.guittot, zhangsenchuan, dlemoal,
manivannan.sadhasivam
On 2026/01/21 (Wednesday) 21:22, Niklas Cassel wrote:
> Hello Shawn,
>
> On Wed, Jan 21, 2026 at 08:45:39PM +0800, Shawn Lin wrote:
>> On 2026/01/02 (Friday) 20:01, Niklas Cassel wrote:
>>
>> Hi Niklas,
>>
>> Sorry for chiming in on this so late. There is a register called
>> PCIE_CLIENT_GENERAL_DEBUG_CON you may find on RK3588 TRM, you could
>> hold LTSSM on EP side in DETECT_QUIET before enabling training, by
>> setting BIT(6). And when EP side is ready to go, just clear BIT(6),
>> so the link is able to be established and host side can rescan to
>> find the EP properly.
>
> Thank you for the suggestion.
>
> Reading the register description of this debug control register.
> For as long as sd_hold_ltssm is set, the controller stays in the
> current LTSSM.
>
> We could probably set this on the EP side, and only when starting
> the link do we clear this bit.
>
> However, I think that Mani's current proposal:
> https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=controller/dwc&id=01d16b8afb7afcc17f999f8b4a9b9cfe6c6fae71
>
>
> Will work with more controllers running in EP mode, not just rk3588.
>
> Also, when powering on both boards at the same time, it is possible that
> the host side driver gets probed before the EP side.
> If the EP side driver has not been probed to set bit sd_hold_ltssm,
> the host will still see a load connected, but link training will fail,
> so it will still jump to Poll.Compliance.
>
> So AFAICT, Mani's proposal:
> 1) Seems more generic.
> 2) Seems less racy.
>
Just an update on what I found when playing with the IP
on this topic :)
1. Enable the PCIE_CAP_ENTER_COMPLIANCE bit in
LINK_CONTROL2_LINK_STATUS2_REG (PCIE_CAP + 0xa0), which forces the RP
into Poll.Compliance to simulate this situation, even without a device
connected to the slot.
2. Set DIRECT_POLCOMP_TO_DETECT (bit 9) and HOLD_LTSSM (bit 0) in
SD_CONTROL2_REG; the LTSSM is still in Poll.Compliance.
3. Clear HOLD_LTSSM only; the LTSSM goes back to Detect.Quiet and stays
there.
This seems to work for DWC controllers with RAS_DES support. But sure,
Mani's approach looks more straightforward and works for all
controllers.
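Steps 2 and 3 above can be sketched as two read-modify-write operations on the SD_CONTROL2 value. This is a plain userspace illustration that operates on a variable instead of a DBI register; the bit positions come from this thread, while the macro and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions as quoted in this thread; the macro names are illustrative. */
#define SD_CONTROL2_HOLD_LTSSM               (UINT32_C(1) << 0)
#define SD_CONTROL2_DIRECT_POLCOMP_TO_DETECT (UINT32_C(1) << 9)

/* Step 2: set both bits. The LTSSM is held, so it stays in Poll.Compliance. */
static void sd_control2_arm(uint32_t *reg)
{
	assert(reg != NULL);
	*reg |= SD_CONTROL2_DIRECT_POLCOMP_TO_DETECT | SD_CONTROL2_HOLD_LTSSM;
}

/*
 * Step 3: clear only the hold bit and leave DIRECT_POLCOMP_TO_DETECT set,
 * so that the LTSSM is free to leave Poll.Compliance for Detect.
 */
static void sd_control2_release(uint32_t *reg)
{
	assert(reg != NULL);
	*reg &= ~SD_CONTROL2_HOLD_LTSSM;
}
```

In a driver, the variable would be replaced by a DBI read of the RAS_DES capability offset plus the SD_CONTROL2 offset, followed by a DBI write of the updated value.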
>
> Kind regards,
> Niklas
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2025-12-30 15:07 [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Manivannan Sadhasivam via B4 Relay
` (4 preceding siblings ...)
2026-01-02 12:01 ` [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API Niklas Cassel
@ 2026-01-05 10:04 ` Vincent Guittot
2026-01-05 11:52 ` Manivannan Sadhasivam
5 siblings, 1 reply; 17+ messages in thread
From: Vincent Guittot @ 2026-01-05 10:04 UTC (permalink / raw)
To: manivannan.sadhasivam
Cc: Jingoo Han, Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, zhangsenchuan, Shawn Lin
On Tue, 30 Dec 2025 at 16:07, Manivannan Sadhasivam via B4 Relay
<devnull+manivannan.sadhasivam.oss.qualcomm.com@kernel.org> wrote:
>
> Hi,
>
> This series reworks the dw_pcie_wait_for_link() API to allow the callers to
> detect the absence of the device on the bus and skip the failure.
>
> Compared to v2, I've reworked the patch 2 to improve the API further and
> dropped the patch 1 that got applied (hence changed the subject). I've also
> modified the error code based on the feedback in v2 to return -ENODEV if device
> is not detected on the bus and -ETIMEDOUT otherwise. This allows the callers to
> skip the failure if device is not detected and handle error for other failure.
>
> Testing
> =======
>
> Tested this series on Rb3Gen2 board without powering on the PCIe switch. Now the
> dw_pcie_wait_for_link() API prints:
>
> qcom-pcie 1c08000.pcie: Device not found
>
> Instead of the previous log:
>
> qcom-pcie 1c08000.pcie: Phy link never came up
I tested the patchset with s32g399a-rdb3 and during the resume, I have:
[ 460.255927] s32g-pcie 44100000.pcie: Device not found
[ 460.256021] s32g-pcie 44100000.pcie: PM: dpm_run_callback(): s32g_pcie_resume_noirq returns -19
[ 460.256278] s32g-pcie 44100000.pcie: PM: failed to resume noirq: error -19
I was expecting only the first line, "Device not found", as during init:
[ 2.668921] s32g-pcie 44100000.pcie: Device not found
[ 2.675342] s32g-pcie 44100000.pcie: PCI host bridge to bus 0001:00
where we skip the -ENODEV case if dw_pcie_wait_for_link() fails.
Should we skip the -ENODEV case in dw_pcie_resume_noirq() too?
>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> ---
> Changes in v3:
> - Dropped patch 1 that got applied
> - Reworked the error handling of dw_pcie_wait_for_link() API further
> - Link to v2: https://lore.kernel.org/r/20251218-pci-dwc-suspend-rework-v2-0-5a7778c6094a@oss.qualcomm.com
>
> Changes in v2:
> - Changed the logic to check for Detect.Quiet/Active states
> - Collected tags and rebased on top of v6.19-rc1
> - Link to v1: https://lore.kernel.org/r/20251119-pci-dwc-suspend-rework-v1-0-aad104828562@oss.qualcomm.com
>
> ---
> Manivannan Sadhasivam (4):
> PCI: dwc: Return -ENODEV from dw_pcie_wait_for_link() if device is not found
> PCI: dwc: Rename and move ltssm_status_string() to pcie-designware.c
> PCI: dwc: Rework the error print of dw_pcie_wait_for_link()
> PCI: dwc: Only skip the dw_pcie_wait_for_link() failure if it returns -ENODEV
>
> .../pci/controller/dwc/pcie-designware-debugfs.c | 54 +---------------
> drivers/pci/controller/dwc/pcie-designware-host.c | 6 +-
> drivers/pci/controller/dwc/pcie-designware.c | 75 +++++++++++++++++++++-
> drivers/pci/controller/dwc/pcie-designware.h | 2 +
> 4 files changed, 80 insertions(+), 57 deletions(-)
> ---
> base-commit: 68ac85fb42cfeb081cf029acdd8aace55ed375a2
> change-id: 20251119-pci-dwc-suspend-rework-8b0515a38679
>
> Best regards,
> --
> Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v3 0/4] PCI: dwc: Rework the error handling of dw_pcie_wait_for_link() API
2026-01-05 10:04 ` Vincent Guittot
@ 2026-01-05 11:52 ` Manivannan Sadhasivam
0 siblings, 0 replies; 17+ messages in thread
From: Manivannan Sadhasivam @ 2026-01-05 11:52 UTC (permalink / raw)
To: Vincent Guittot
Cc: manivannan.sadhasivam, Jingoo Han, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
linux-kernel, zhangsenchuan, Shawn Lin
On Mon, Jan 05, 2026 at 11:04:21AM +0100, Vincent Guittot wrote:
> On Tue, 30 Dec 2025 at 16:07, Manivannan Sadhasivam via B4 Relay
> <devnull+manivannan.sadhasivam.oss.qualcomm.com@kernel.org> wrote:
> >
> > Hi,
> >
> > This series reworks the dw_pcie_wait_for_link() API to allow the callers to
> > detect the absence of the device on the bus and skip the failure.
> >
> > Compared to v2, I've reworked the patch 2 to improve the API further and
> > dropped the patch 1 that got applied (hence changed the subject). I've also
> > modified the error code based on the feedback in v2 to return -ENODEV if device
> > is not detected on the bus and -ETIMEDOUT otherwise. This allows the callers to
> > skip the failure if device is not detected and handle error for other failure.
> >
> > Testing
> > =======
> >
> > Tested this series on Rb3Gen2 board without powering on the PCIe switch. Now the
> > dw_pcie_wait_for_link() API prints:
> >
> > qcom-pcie 1c08000.pcie: Device not found
> >
> > Instead of the previous log:
> >
> > qcom-pcie 1c08000.pcie: Phy link never came up
>
> I tested the patchset with s32g399a-rdb3 and during the resume, I have:
>
> [ 460.255927] s32g-pcie 44100000.pcie: Device not found
> [ 460.256021] s32g-pcie 44100000.pcie: PM: dpm_run_callback(): s32g_pcie_resume_noirq returns -19
> [ 460.256278] s32g-pcie 44100000.pcie: PM: failed to resume noirq: error -19
>
> I was not expecting more lines than the 1st line: Device not found,
> like the init
>
> [ 2.668921] s32g-pcie 44100000.pcie: Device not found
> [ 2.675342] s32g-pcie 44100000.pcie: PCI host bridge to bus 0001:00
>
> where we skip the -ENODEV case if dw_pcie_wait_for_link() fails
>
> Should we skip the -ENODEV case in dw_pcie_resume_noirq() too ?
>
I proposed it initially, but then there were concerns raised that there is a
possibility that the device could be removed during suspend and we would fail to
detect it.
But I think that could be handled by checking the 'pci_bus::devices' list. If
this list is empty, then we know for sure that there was no device connected to
the bus before suspend. So if dw_pcie_wait_for_link() returns -ENODEV and
this list is also empty, we can safely ignore the failure.
I'll do it in v4.
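The check described above could look roughly like the following userspace sketch. It is not the v4 patch: the helper name resume_link_result() is made up, and the bus_had_devices flag stands in for whether the 'pci_bus::devices' list was non-empty (in the kernel that would presumably be a list_empty() test on the list).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide what the resume path should return, given
 * the result of waiting for the link and whether any device was present
 * on the bus before suspend.
 */
static int resume_link_result(int wait_ret, bool bus_had_devices)
{
	/* Only 0 or a negative errno value is expected here. */
	assert(wait_ret <= 0);

	/* Nothing was connected before suspend, so "no device" is not an error. */
	if (wait_ret == -ENODEV && !bus_had_devices)
		return 0;

	/* A device that was there before is now missing, or the link failed. */
	return wait_ret;
}
```

With this, the "Device not found" case on a bus that was empty before suspend returns 0, while a device that disappeared during suspend still propagates -ENODEV to the PM core.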
- Mani
> >
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> > ---
> > Changes in v3:
> > - Dropped patch 1 that got applied
> > - Reworked the error handling of dw_pcie_wait_for_link() API further
> > - Link to v2: https://lore.kernel.org/r/20251218-pci-dwc-suspend-rework-v2-0-5a7778c6094a@oss.qualcomm.com
> >
> > Changes in v2:
> > - Changed the logic to check for Detect.Quiet/Active states
> > - Collected tags and rebased on top of v6.19-rc1
> > - Link to v1: https://lore.kernel.org/r/20251119-pci-dwc-suspend-rework-v1-0-aad104828562@oss.qualcomm.com
> >
> > ---
> > Manivannan Sadhasivam (4):
> > PCI: dwc: Return -ENODEV from dw_pcie_wait_for_link() if device is not found
> > PCI: dwc: Rename and move ltssm_status_string() to pcie-designware.c
> > PCI: dwc: Rework the error print of dw_pcie_wait_for_link()
> > PCI: dwc: Only skip the dw_pcie_wait_for_link() failure if it returns -ENODEV
> >
> > .../pci/controller/dwc/pcie-designware-debugfs.c | 54 +---------------
> > drivers/pci/controller/dwc/pcie-designware-host.c | 6 +-
> > drivers/pci/controller/dwc/pcie-designware.c | 75 +++++++++++++++++++++-
> > drivers/pci/controller/dwc/pcie-designware.h | 2 +
> > 4 files changed, 80 insertions(+), 57 deletions(-)
> > ---
> > base-commit: 68ac85fb42cfeb081cf029acdd8aace55ed375a2
> > change-id: 20251119-pci-dwc-suspend-rework-8b0515a38679
> >
> > Best regards,
> > --
> > Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
> >
> >
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 17+ messages in thread