* [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
@ 2025-05-06 7:39 Niklas Cassel
2025-05-06 7:39 ` [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready Niklas Cassel
` (4 more replies)
0 siblings, 5 replies; 34+ messages in thread
From: Niklas Cassel @ 2025-05-06 7:39 UTC (permalink / raw)
To: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Niklas Cassel, Krishna chaitanya chundru
Cc: Wilfred Mallawa, Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip, linux-arm-msm
Hello there,
Commit 8d3bf19f1b58 ("PCI: dwc: Don't wait for link up if driver can detect
Link Up event") added support for DWC to not wait for link up before
enumerating the bus. However, we cannot simply enumerate the bus after
receiving a Link Up IRQ, we still need to wait PCIE_T_RRS_READY_MS time
to allow a device to become ready after deasserting PERST. To avoid
bringing back an conditional delay during probe, perform the wait in the
threaded IRQ handler instead.
Please review.
Kind regards,
Niklas
Changes since v1:
-Added missing include pci.h that was lost during rebase.
Niklas Cassel (4):
PCI: dw-rockchip: Do not enumerate bus before endpoint devices are
ready
PCI: qcom: Do not enumerate bus before endpoint devices are ready
PCI: dw-rockchip: Replace PERST sleep time with proper macro
PCI: qcom: Replace PERST sleep time with proper macro
drivers/pci/controller/dwc/pcie-dw-rockchip.c | 4 +++-
drivers/pci/controller/dwc/pcie-qcom.c | 3 ++-
2 files changed, 5 insertions(+), 2 deletions(-)
--
2.49.0
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-06 7:39 [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
@ 2025-05-06 7:39 ` Niklas Cassel
2025-05-06 11:32 ` Laszlo Fiat
` (2 more replies)
2025-05-06 7:39 ` [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro Niklas Cassel
` (3 subsequent siblings)
4 siblings, 3 replies; 34+ messages in thread
From: Niklas Cassel @ 2025-05-06 7:39 UTC (permalink / raw)
To: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Niklas Cassel
Cc: Wilfred Mallawa, Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
Commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can
detect Link Up") changed so that we no longer call dw_pcie_wait_for_link(),
and instead enumerate the bus when receiving a Link Up IRQ.
Laszlo Fiat reported (off-list) that his PLEXTOR PX-256M8PeGN NVMe SSD is
no longer functional, and simply reverting commit ec9fd499b9c6 ("PCI:
dw-rockchip: Don't wait for link since we can detect Link Up") makes his
SSD functional again.
It seems that we are enumerating the bus before the endpoint is ready.
Adding a msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
threaded IRQ handler makes the SSD functional once again.
What appears to happen is that before ec9fd499b9c6, we called
dw_pcie_wait_for_link(), and in the first iteration of the loop, the link
will never be up (because the link was just started),
dw_pcie_wait_for_link() will then sleep for LINK_WAIT_SLEEP_MS (90 ms),
before trying again.
This means that even if a driver was missing a msleep(PCIE_T_RRS_READY_MS)
(100 ms), because of the call to dw_pcie_wait_for_link(), enumerating the
bus would essentially be delayed by that time anyway (because of the sleep
LINK_WAIT_SLEEP_MS (90 ms)).
While we could add the msleep(PCIE_T_RRS_READY_MS) after deasserting PERST,
that would essentially bring back an unconditional delay during synchronous
probe (the whole reason to use a Link Up IRQ was to avoid an unconditional
delay during probe).
Thus, add the msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
IRQ handler. This way, we will not have an unconditional delay during boot
for unpopulated PCIe slots.
Cc: Laszlo Fiat <laszlo.fiat@proton.me>
Fixes: ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
Signed-off-by: Niklas Cassel <cassel@kernel.org>
---
drivers/pci/controller/dwc/pcie-dw-rockchip.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
index 3c6ab71c996e..6a7ec3545a7f 100644
--- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
+++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
@@ -23,6 +23,7 @@
#include <linux/reset.h>
#include "pcie-designware.h"
+#include "../../pci.h"
/*
* The upper 16 bits of PCIE_CLIENT_CONFIG are a write
@@ -458,6 +459,7 @@ static irqreturn_t rockchip_pcie_rc_sys_irq_thread(int irq, void *arg)
if (reg & PCIE_RDLH_LINK_UP_CHGED) {
if (rockchip_pcie_link_up(pci)) {
dev_dbg(dev, "Received Link up event. Starting enumeration!\n");
+ msleep(PCIE_T_RRS_READY_MS);
/* Rescan the bus to enumerate endpoint devices */
pci_lock_rescan_remove();
pci_rescan_bus(pp->bridge->bus);
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro
2025-05-06 7:39 [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
2025-05-06 7:39 ` [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready Niklas Cassel
@ 2025-05-06 7:39 ` Niklas Cassel
2025-05-06 9:07 ` Hans Zhang
` (2 more replies)
2025-05-06 14:33 ` [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
` (2 subsequent siblings)
4 siblings, 3 replies; 34+ messages in thread
From: Niklas Cassel @ 2025-05-06 7:39 UTC (permalink / raw)
To: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner
Cc: Wilfred Mallawa, Damien Le Moal, Hans Zhang, Laszlo Fiat,
Niklas Cassel, linux-pci, linux-arm-kernel, linux-rockchip
Replace the PERST sleep time with the proper macro (PCIE_T_PVPERL_MS).
No functional change.
Signed-off-by: Niklas Cassel <cassel@kernel.org>
---
drivers/pci/controller/dwc/pcie-dw-rockchip.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
index 6a7ec3545a7f..0baf9da3ac1c 100644
--- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
+++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
@@ -225,7 +225,7 @@ static int rockchip_pcie_start_link(struct dw_pcie *pci)
* We need more extra time as before, rather than setting just
* 100us as we don't know how long should the device need to reset.
*/
- msleep(100);
+ msleep(PCIE_T_PVPERL_MS);
gpiod_set_value_cansleep(rockchip->rst_gpio, 1);
return 0;
--
2.49.0
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro
2025-05-06 7:39 ` [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro Niklas Cassel
@ 2025-05-06 9:07 ` Hans Zhang
2025-05-06 11:36 ` Laszlo Fiat
2025-05-06 22:24 ` Wilfred Mallawa
2 siblings, 0 replies; 34+ messages in thread
From: Hans Zhang @ 2025-05-06 9:07 UTC (permalink / raw)
To: Niklas Cassel, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner
Cc: Wilfred Mallawa, Damien Le Moal, Laszlo Fiat, linux-pci,
linux-arm-kernel, linux-rockchip
On 2025/5/6 15:39, Niklas Cassel wrote:
> Replace the PERST sleep time with the proper macro (PCIE_T_PVPERL_MS).
> No functional change.
>
> Signed-off-by: Niklas Cassel <cassel@kernel.org>
> ---
> drivers/pci/controller/dwc/pcie-dw-rockchip.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> index 6a7ec3545a7f..0baf9da3ac1c 100644
> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> @@ -225,7 +225,7 @@ static int rockchip_pcie_start_link(struct dw_pcie *pci)
> * We need more extra time as before, rather than setting just
> * 100us as we don't know how long should the device need to reset.
> */
> - msleep(100);
> + msleep(PCIE_T_PVPERL_MS);
> gpiod_set_value_cansleep(rockchip->rst_gpio, 1);
>
> return 0;
Reviewed-by: Hans Zhang <18255117159@163.com>
Best regards,
Hans
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-06 7:39 ` [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready Niklas Cassel
@ 2025-05-06 11:32 ` Laszlo Fiat
2025-05-06 22:23 ` Wilfred Mallawa
2025-05-28 22:42 ` Bjorn Helgaas
2 siblings, 0 replies; 34+ messages in thread
From: Laszlo Fiat @ 2025-05-06 11:32 UTC (permalink / raw)
To: Niklas Cassel
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Wilfred Mallawa, Damien Le Moal, Hans Zhang,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Tuesday, May 6th, 2025 at 9:39 AM, Niklas Cassel <cassel@kernel.org> wrote:
> Commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can
> detect Link Up") changed so that we no longer call dw_pcie_wait_for_link(),
> and instead enumerate the bus when receiving a Link Up IRQ.
>
> Laszlo Fiat reported (off-list) that his PLEXTOR PX-256M8PeGN NVMe SSD is
> no longer functional, and simply reverting commit ec9fd499b9c6 ("PCI:
> dw-rockchip: Don't wait for link since we can detect Link Up") makes his
> SSD functional again.
>
> It seems that we are enumerating the bus before the endpoint is ready.
> Adding a msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
> threaded IRQ handler makes the SSD functional once again.
>
> What appears to happen is that before ec9fd499b9c6, we called
> dw_pcie_wait_for_link(), and in the first iteration of the loop, the link
> will never be up (because the link was just started),
> dw_pcie_wait_for_link() will then sleep for LINK_WAIT_SLEEP_MS (90 ms),
> before trying again.
>
> This means that even if a driver was missing a msleep(PCIE_T_RRS_READY_MS)
> (100 ms), because of the call to dw_pcie_wait_for_link(), enumerating the
> bus would essentially be delayed by that time anyway (because of the sleep
> LINK_WAIT_SLEEP_MS (90 ms)).
>
> While we could add the msleep(PCIE_T_RRS_READY_MS) after deasserting PERST,
> that would essentially bring back an unconditional delay during synchronous
> probe (the whole reason to use a Link Up IRQ was to avoid an unconditional
> delay during probe).
>
> Thus, add the msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
> IRQ handler. This way, we will not have an unconditional delay during boot
> for unpopulated PCIe slots.
>
> Cc: Laszlo Fiat laszlo.fiat@proton.me
>
> Fixes: ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
> Signed-off-by: Niklas Cassel cassel@kernel.org
>
> ---
> drivers/pci/controller/dwc/pcie-dw-rockchip.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> index 3c6ab71c996e..6a7ec3545a7f 100644
> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> @@ -23,6 +23,7 @@
> #include <linux/reset.h>
>
>
> #include "pcie-designware.h"
> +#include "../../pci.h"
>
> /*
> * The upper 16 bits of PCIE_CLIENT_CONFIG are a write
> @@ -458,6 +459,7 @@ static irqreturn_t rockchip_pcie_rc_sys_irq_thread(int irq, void arg)
> if (reg & PCIE_RDLH_LINK_UP_CHGED) {
> if (rockchip_pcie_link_up(pci)) {
> dev_dbg(dev, "Received Link up event. Starting enumeration!\n");
> + msleep(PCIE_T_RRS_READY_MS);
> / Rescan the bus to enumerate endpoint devices */
> pci_lock_rescan_remove();
> pci_rescan_bus(pp->bridge->bus);
>
> --
> 2.49.0
Tested-by: Laszlo Fiat <laszlo.fiat@proton.me>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro
2025-05-06 7:39 ` [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro Niklas Cassel
2025-05-06 9:07 ` Hans Zhang
@ 2025-05-06 11:36 ` Laszlo Fiat
2025-05-06 22:24 ` Wilfred Mallawa
2 siblings, 0 replies; 34+ messages in thread
From: Laszlo Fiat @ 2025-05-06 11:36 UTC (permalink / raw)
To: Niklas Cassel
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Wilfred Mallawa, Damien Le Moal, Hans Zhang, linux-pci,
linux-arm-kernel, linux-rockchip
On Tuesday, May 6th, 2025 at 9:39 AM, Niklas Cassel <cassel@kernel.org> wrote:
> Replace the PERST sleep time with the proper macro (PCIE_T_PVPERL_MS).
> No functional change.
>
> Signed-off-by: Niklas Cassel cassel@kernel.org
>
> ---
> drivers/pci/controller/dwc/pcie-dw-rockchip.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> index 6a7ec3545a7f..0baf9da3ac1c 100644
> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> @@ -225,7 +225,7 @@ static int rockchip_pcie_start_link(struct dw_pcie *pci)
> * We need more extra time as before, rather than setting just
> * 100us as we don't know how long should the device need to reset.
> */
> - msleep(100);
> + msleep(PCIE_T_PVPERL_MS);
> gpiod_set_value_cansleep(rockchip->rst_gpio, 1);
>
>
> return 0;
> --
> 2.49.0
Tested-by: Laszlo Fiat <laszlo.fiat@proton.me>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-06 7:39 [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
2025-05-06 7:39 ` [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready Niklas Cassel
2025-05-06 7:39 ` [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro Niklas Cassel
@ 2025-05-06 14:33 ` Niklas Cassel
2025-05-06 14:51 ` Laszlo Fiat
2025-05-13 10:53 ` Manivannan Sadhasivam
2025-05-19 12:37 ` Manivannan Sadhasivam
4 siblings, 1 reply; 34+ messages in thread
From: Niklas Cassel @ 2025-05-06 14:33 UTC (permalink / raw)
To: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Krishna chaitanya chundru
Cc: Wilfred Mallawa, Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip, linux-arm-msm
On Tue, May 06, 2025 at 09:39:35AM +0200, Niklas Cassel wrote:
> Hello there,
>
> Commit 8d3bf19f1b58 ("PCI: dwc: Don't wait for link up if driver can detect
> Link Up event") added support for DWC to not wait for link up before
> enumerating the bus. However, we cannot simply enumerate the bus after
> receiving a Link Up IRQ, we still need to wait PCIE_T_RRS_READY_MS time
> to allow a device to become ready after deasserting PERST. To avoid
> bringing back an conditional delay during probe, perform the wait in the
> threaded IRQ handler instead.
>
> Please review.
>
>
> Kind regards,
> Niklas
Hello Laszlo,
Since you have a RK3588 based board, sending a Tested-by
tag on testing patches 1-2 makes sense.
However, patches 3-4 are for Qualcomm based boards.
I'm not sure if you also have a Qualcomm board at your disposal.
If you don't, Manivannan can probably drop your Tested-by tags
on patches 3-4, but it might be good to know until next time.
Thank you for testing!
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-06 14:33 ` [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
@ 2025-05-06 14:51 ` Laszlo Fiat
0 siblings, 0 replies; 34+ messages in thread
From: Laszlo Fiat @ 2025-05-06 14:51 UTC (permalink / raw)
To: Niklas Cassel
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Krishna chaitanya chundru, Wilfred Mallawa, Damien Le Moal,
Hans Zhang, Krzysztof Wilczyński, linux-pci@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-rockchip@lists.infradead.org, linux-arm-msm@vger.kernel.org
-------- Original Message --------
On 06/05/2025 16:33, Niklas Cassel <cassel@kernel.org> wrote:
> On Tue, May 06, 2025 at 09:39:35AM +0200, Niklas Cassel wrote:
> > Hello there,
> >
> > Commit 8d3bf19f1b58 ("PCI: dwc: Don't wait for link up if driver can detect
> > Link Up event") added support for DWC to not wait for link up before
> > enumerating the bus. However, we cannot simply enumerate the bus after
> > receiving a Link Up IRQ, we still need to wait PCIE_T_RRS_READY_MS time
> > to allow a device to become ready after deasserting PERST. To avoid
> > bringing back an conditional delay during probe, perform the wait in the
> > threaded IRQ handler instead.
> >
> > Please review.
> >
> >
> > Kind regards,
> > Niklas
>
> Hello Laszlo,
>
> Since you have a RK3588 based board, sending a Tested-by
> tag on testing patches 1-2 makes sense.
>
> However, patches 3-4 are for Qualcomm based boards.
> I'm not sure if you also have a Qualcomm board at your disposal.
>
> If you don't, Manivannan can probably drop your Tested-by tags
> on patches 3-4, but it might be good to know until next time.
Hello Niklas,
You are right, I only have a Rockchip based board, which I have tested on.
Mannivan should drop my Tested-by tags on the qcom patches. My mistake, I am sorry.
>
> Thank you for testing!
>
You are welcome! Thank you for making my board work again.
Bye,
Laszlo
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-06 7:39 ` [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready Niklas Cassel
2025-05-06 11:32 ` Laszlo Fiat
@ 2025-05-06 22:23 ` Wilfred Mallawa
2025-05-28 22:42 ` Bjorn Helgaas
2 siblings, 0 replies; 34+ messages in thread
From: Wilfred Mallawa @ 2025-05-06 22:23 UTC (permalink / raw)
To: robh@kernel.org, kw@linux.com, bhelgaas@google.com,
cassel@kernel.org, manivannan.sadhasivam@linaro.org,
lpieralisi@kernel.org, heiko@sntech.de
Cc: linux-arm-kernel@lists.infradead.org, kwilczynski@kernel.org,
laszlo.fiat@proton.me, dlemoal@kernel.org, 18255117159@163.com,
linux-rockchip@lists.infradead.org, linux-pci@vger.kernel.org
On Tue, 2025-05-06 at 09:39 +0200, Niklas Cassel wrote:
> Commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we
> can
> detect Link Up") changed so that we no longer call
> dw_pcie_wait_for_link(),
> and instead enumerate the bus when receiving a Link Up IRQ.
>
> Laszlo Fiat reported (off-list) that his PLEXTOR PX-256M8PeGN NVMe
> SSD is
> no longer functional, and simply reverting commit ec9fd499b9c6 ("PCI:
> dw-rockchip: Don't wait for link since we can detect Link Up") makes
> his
> SSD functional again.
>
> It seems that we are enumerating the bus before the endpoint is
> ready.
> Adding a msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in
> the
> threaded IRQ handler makes the SSD functional once again.
>
> What appears to happen is that before ec9fd499b9c6, we called
> dw_pcie_wait_for_link(), and in the first iteration of the loop, the
> link
> will never be up (because the link was just started),
> dw_pcie_wait_for_link() will then sleep for LINK_WAIT_SLEEP_MS (90
> ms),
> before trying again.
>
> This means that even if a driver was missing a
> msleep(PCIE_T_RRS_READY_MS)
> (100 ms), because of the call to dw_pcie_wait_for_link(), enumerating
> the
> bus would essentially be delayed by that time anyway (because of the
> sleep
> LINK_WAIT_SLEEP_MS (90 ms)).
>
> While we could add the msleep(PCIE_T_RRS_READY_MS) after deasserting
> PERST,
> that would essentially bring back an unconditional delay during
> synchronous
> probe (the whole reason to use a Link Up IRQ was to avoid an
> unconditional
> delay during probe).
>
> Thus, add the msleep(PCIE_T_RRS_READY_MS) before enumerating the bus
> in the
> IRQ handler. This way, we will not have an unconditional delay during
> boot
> for unpopulated PCIe slots.
>
> Cc: Laszlo Fiat <laszlo.fiat@proton.me>
> Fixes: ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we
> can detect Link Up")
> Signed-off-by: Niklas Cassel <cassel@kernel.org>
> ---
> drivers/pci/controller/dwc/pcie-dw-rockchip.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> index 3c6ab71c996e..6a7ec3545a7f 100644
> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> @@ -23,6 +23,7 @@
> #include <linux/reset.h>
>
> #include "pcie-designware.h"
> +#include "../../pci.h"
>
> /*
> * The upper 16 bits of PCIE_CLIENT_CONFIG are a write
> @@ -458,6 +459,7 @@ static irqreturn_t
> rockchip_pcie_rc_sys_irq_thread(int irq, void *arg)
> if (reg & PCIE_RDLH_LINK_UP_CHGED) {
> if (rockchip_pcie_link_up(pci)) {
> dev_dbg(dev, "Received Link up event.
> Starting enumeration!\n");
> + msleep(PCIE_T_RRS_READY_MS);
> /* Rescan the bus to enumerate endpoint
> devices */
> pci_lock_rescan_remove();
> pci_rescan_bus(pp->bridge->bus);
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro
2025-05-06 7:39 ` [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro Niklas Cassel
2025-05-06 9:07 ` Hans Zhang
2025-05-06 11:36 ` Laszlo Fiat
@ 2025-05-06 22:24 ` Wilfred Mallawa
2 siblings, 0 replies; 34+ messages in thread
From: Wilfred Mallawa @ 2025-05-06 22:24 UTC (permalink / raw)
To: robh@kernel.org, kw@linux.com, bhelgaas@google.com,
cassel@kernel.org, manivannan.sadhasivam@linaro.org,
lpieralisi@kernel.org, heiko@sntech.de
Cc: linux-arm-kernel@lists.infradead.org, linux-pci@vger.kernel.org,
laszlo.fiat@proton.me, dlemoal@kernel.org, 18255117159@163.com,
linux-rockchip@lists.infradead.org
On Tue, 2025-05-06 at 09:39 +0200, Niklas Cassel wrote:
> Replace the PERST sleep time with the proper macro
> (PCIE_T_PVPERL_MS).
> No functional change.
>
> Signed-off-by: Niklas Cassel <cassel@kernel.org>
> ---
> drivers/pci/controller/dwc/pcie-dw-rockchip.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> index 6a7ec3545a7f..0baf9da3ac1c 100644
> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> @@ -225,7 +225,7 @@ static int rockchip_pcie_start_link(struct
> dw_pcie *pci)
> * We need more extra time as before, rather than setting
> just
> * 100us as we don't know how long should the device need to
> reset.
> */
> - msleep(100);
> + msleep(PCIE_T_PVPERL_MS);
> gpiod_set_value_cansleep(rockchip->rst_gpio, 1);
>
> return 0;
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-06 7:39 [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
` (2 preceding siblings ...)
2025-05-06 14:33 ` [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
@ 2025-05-13 10:53 ` Manivannan Sadhasivam
2025-05-13 14:07 ` Niklas Cassel
2025-05-19 12:37 ` Manivannan Sadhasivam
4 siblings, 1 reply; 34+ messages in thread
From: Manivannan Sadhasivam @ 2025-05-13 10:53 UTC (permalink / raw)
To: Niklas Cassel
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
Bjorn Helgaas, Heiko Stuebner, Krishna chaitanya chundru,
Wilfred Mallawa, Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip, linux-arm-msm
On Tue, May 06, 2025 at 09:39:35AM +0200, Niklas Cassel wrote:
> Hello there,
>
> Commit 8d3bf19f1b58 ("PCI: dwc: Don't wait for link up if driver can detect
> Link Up event") added support for DWC to not wait for link up before
> enumerating the bus. However, we cannot simply enumerate the bus after
> receiving a Link Up IRQ, we still need to wait PCIE_T_RRS_READY_MS time
> to allow a device to become ready after deasserting PERST. To avoid
> bringing back an conditional delay during probe, perform the wait in the
> threaded IRQ handler instead.
>
This wait time is a grey area in the spec tbh. If the Readiness Notification
(RN) is not supported, then the spec suggests waiting 1s for the device to
become 'configuration ready'. That's why we have the 1s delay in dwc driver.
Also, it has the below in r6.0, sec 6.6.1:
```
* On the completion of Link Training (entering the DL_Active state, see §
Section 3.2 ), a component must be able to receive and process TLPs and DLLPs.
* Following exit from a Conventional Reset of a device, within 1.0 s the device
must be able to receive a Configuration Request and return a Successful
Completion if the Request is valid. This period is independent of how quickly
Link training completes. If Readiness Notifications mechanisms are used (see
§ Section 6.22 .), this period may be shorter.
```
As per the first note, once link training is completed, the device should be
ready to accept configuration requests from the host. So no delay should be
required.
But the second note says that the 1s delay is independent of how quickly the
link training completes. This essentially contradicts with the above point.
So I think it is not required to add delay after completing the LTSSM, unless
someone sees any issue.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-13 10:53 ` Manivannan Sadhasivam
@ 2025-05-13 14:07 ` Niklas Cassel
2025-05-15 17:33 ` Laszlo Fiat
0 siblings, 1 reply; 34+ messages in thread
From: Niklas Cassel @ 2025-05-13 14:07 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
Bjorn Helgaas, Heiko Stuebner, Krishna chaitanya chundru,
Wilfred Mallawa, Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip, linux-arm-msm
Hello Mani,
On Tue, May 13, 2025 at 11:53:29AM +0100, Manivannan Sadhasivam wrote:
>
> This wait time is a grey area in the spec tbh. If the Readiness Notification
> (RN) is not supported, then the spec suggests waiting 1s for the device to
> become 'configuration ready'. That's why we have the 1s delay in dwc driver.
>
> Also, it has the below in r6.0, sec 6.6.1:
>
> ```
> * On the completion of Link Training (entering the DL_Active state, see §
> Section 3.2 ), a component must be able to receive and process TLPs and DLLPs.
> * Following exit from a Conventional Reset of a device, within 1.0 s the device
> must be able to receive a Configuration Request and return a Successful
> Completion if the Request is valid. This period is independent of how quickly
> Link training completes. If Readiness Notifications mechanisms are used (see
> § Section 6.22 .), this period may be shorter.
> ```
>
> As per the first note, once link training is completed, the device should be
> ready to accept configuration requests from the host. So no delay should be
> required.
>
> But the second note says that the 1s delay is independent of how quickly the
> link training completes. This essentially contradicts with the above point.
>
> So I think it is not required to add delay after completing the LTSSM, unless
> someone sees any issue.
If you look at the commit message in patch 1/2, the whole reason for this
series is that someone has seen an issue :)
While I personally haven't seen any issue, the user reporting that commit
ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect
Link Up") regressed his system so that it can no longer mount rootfs
(which is on a PLEXTOR PX-256M8PeGN NVMe SSD) clearly has seen an issue.
It is possible that his device is not following the spec.
I simply compared the code before and after ec9fd499b9c6, to try to
figure out why it was actually working before, and came up with this,
which made his device functional again.
Perhaps we should add a comment above the sleep that says that this
should strictly not be needed as per the spec?
(And also add the same comment in the (single) controller driver in
mainline which already does an msleep(PCIE_T_RRS_READY_MS).)
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-13 14:07 ` Niklas Cassel
@ 2025-05-15 17:33 ` Laszlo Fiat
2025-05-16 10:00 ` Niklas Cassel
0 siblings, 1 reply; 34+ messages in thread
From: Laszlo Fiat @ 2025-05-15 17:33 UTC (permalink / raw)
To: Niklas Cassel
Cc: Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Heiko Stuebner, Krishna chaitanya chundru, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Krzysztof Wilczyński, linux-pci,
linux-arm-kernel, linux-rockchip, linux-arm-msm
Hello,
On Tuesday, May 13th, 2025 at 4:07 PM, Niklas Cassel <cassel@kernel.org> wrote:
> Hello Mani,
>
> On Tue, May 13, 2025 at 11:53:29AM +0100, Manivannan Sadhasivam wrote:
>
> > This wait time is a grey area in the spec tbh. If the Readiness Notification
> > (RN) is not supported, then the spec suggests waiting 1s for the device to
> > become 'configuration ready'. That's why we have the 1s delay in dwc driver.
> >
> > Also, it has the below in r6.0, sec 6.6.1:
> >
> > `* On the completion of Link Training (entering the DL_Active state, see § Section 3.2 ), a component must be able to receive and process TLPs and DLLPs. * Following exit from a Conventional Reset of a device, within 1.0 s the device must be able to receive a Configuration Request and return a Successful Completion if the Request is valid. This period is independent of how quickly Link training completes. If Readiness Notifications mechanisms are used (see § Section 6.22 .), this period may be shorter.`
> >
> > As per the first note, once link training is completed, the device should be
> > ready to accept configuration requests from the host. So no delay should be
> > required.
> >
> > But the second note says that the 1s delay is independent of how quickly the
> > link training completes. This essentially contradicts with the above point.
> >
> > So I think it is not required to add delay after completing the LTSSM, unless
> > someone sees any issue.
>
>
> If you look at the commit message in patch 1/2, the whole reason for this
> series is that someone has seen an issue :)
>
> While I personally haven't seen any issue, the user reporting that commit
> ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect
> Link Up") regressed his system so that it can no longer mount rootfs
> (which is on a PLEXTOR PX-256M8PeGN NVMe SSD) clearly has seen an issue.
>
> It is possible that his device is not following the spec.
> I simply compared the code before and after ec9fd499b9c6, to try to
> figure out why it was actually working before, and came up with this,
> which made his device functional again.
>
> Perhaps we should add a comment above the sleep that says that this
> should strictly not be needed as per the spec?
> (And also add the same comment in the (single) controller driver in
> mainline which already does an msleep(PCIE_T_RRS_READY_MS).)
I am the one experiencing the issue with my Orange PI 3B (RK3566, 8 GB RAM) and a PLEXTOR PX-256M8PeGN NVMe SSD.
I first detected the problem while upgrading from 6.13.8 to 6.14.3, that my system cannot find the NVME SSD which contains the rootfs. After reverting the two patches:
- ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
- 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
my system booted fine again.
After that I tested the patches sent by Niklas in this thread, which fixed the issue, so I sent Tested-by.
I did another test Today with 6.15.0-rc6, which in itself does not find my SSD. Niklas asked me to test with these
- revert ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
- revert 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
- apply the following patch:
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index b3615d125942..5dee689ecd95 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
if (dw_pcie_link_up(pci))
break;
- msleep(LINK_WAIT_SLEEP_MS);
+ usleep_range(100, 200);
}
if (retries >= LINK_WAIT_MAX_RETRIES) {
which restores the original behaviour to wait for link-up, then shorten the time. This resulted again a non booting system, this time with "Phy link never came up" error message.
So please allow to fix the regression that is already in 6.14.x. I now so far only I have reported this, but we cannot be sure how many SSDs have this timing issue. Most users use older, distribution packaged kernels, so others will face this later.
Bye,
Laszlo Fiat
>
>
> Kind regards,
> Niklas
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-15 17:33 ` Laszlo Fiat
@ 2025-05-16 10:00 ` Niklas Cassel
2025-05-16 18:48 ` Laszlo Fiat
0 siblings, 1 reply; 34+ messages in thread
From: Niklas Cassel @ 2025-05-16 10:00 UTC (permalink / raw)
To: Laszlo Fiat
Cc: Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Heiko Stuebner, Krishna chaitanya chundru, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Krzysztof Wilczyński, linux-pci,
linux-arm-kernel, linux-rockchip, linux-arm-msm
On Thu, May 15, 2025 at 05:33:41PM +0000, Laszlo Fiat wrote:
> I am the one experiencing the issue with my Orange PI 3B (RK3566, 8 GB RAM) and a PLEXTOR PX-256M8PeGN NVMe SSD.
>
> I first detected the problem while upgrading from 6.13.8 to 6.14.3, that my system cannot find the NVME SSD which contains the rootfs. After reverting the two patches:
>
> - ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
> - 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
>
> my system booted fine again.
> After that I tested the patches sent by Niklas in this thread, which fixed the issue, so I sent Tested-by.
>
> I did another test Today with 6.15.0-rc6, which in itself does not find my SSD. Niklas asked me to test with these
>
> - revert ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
> - revert 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
> - apply the following patch:
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index b3615d125942..5dee689ecd95 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
> if (dw_pcie_link_up(pci))
> break;
>
> - msleep(LINK_WAIT_SLEEP_MS);
> + usleep_range(100, 200);
> }
>
> if (retries >= LINK_WAIT_MAX_RETRIES) {
>
>
> which restores the original behaviour to wait for link-up, then shorten the time. This resulted again a non booting system, this time with "Phy link never came up" error message.
That message was unexpected.
What I expected to happen was that the link would come up, but by reducing
delay between "link is up" and device is accessed (from 90 ms -> 100 us),
I was that you would see the same problem on "older" kernels as you do with
the "link up IRQ" patches (which originally had no delay, but this series
basically re-added the same delay (PCIE_T_RRS_READY_MS, 100 ms) as we had
before (LINK_WAIT_SLEEP_MS, 90 ms).
But I see the problem with the test code that I asked you to test to verify
that this problem also existed before (if you had a shorter delay).
(By reducing the delay, the LINK_WAIT_MAX_RETRIES also need to be bumped.)
Could you please test:
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index b3615d125942..5dee689ecd95 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
if (dw_pcie_link_up(pci))
break;
- msleep(LINK_WAIT_SLEEP_MS);
+ usleep_range(100, 200);
}
if (retries >= LINK_WAIT_MAX_RETRIES) {
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 4dd16aa4b39e..8422661b79d5 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -61,7 +61,7 @@
set_bit(DW_PCIE_CAP_ ## _cap, &(_pci)->caps)
/* Parameters for the waiting for link up routine */
-#define LINK_WAIT_MAX_RETRIES 10
+#define LINK_WAIT_MAX_RETRIES 10000
#define LINK_WAIT_SLEEP_MS 90
/* Parameters for the waiting for iATU enabled routine */
On top of an old kernel instead?
We expect the link to come up, but that you will not be able to mount rootfs.
If that is the case, we are certain that the this patch series is 100% needed
for your device to have the same functional behavior as before.
Kind regards,
Niklas
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-16 10:00 ` Niklas Cassel
@ 2025-05-16 18:48 ` Laszlo Fiat
2025-05-19 9:44 ` Niklas Cassel
0 siblings, 1 reply; 34+ messages in thread
From: Laszlo Fiat @ 2025-05-16 18:48 UTC (permalink / raw)
To: Niklas Cassel
Cc: Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Heiko Stuebner, Krishna chaitanya chundru, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Krzysztof Wilczyński,
linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-rockchip@lists.infradead.org, linux-arm-msm@vger.kernel.org
Hello,
-------- Original Message --------
On 16/05/2025 12:00, Niklas Cassel <cassel@kernel.org> wrote:
> On Thu, May 15, 2025 at 05:33:41PM +0000, Laszlo Fiat wrote:
> > I am the one experiencing the issue with my Orange PI 3B (RK3566, 8 GB RAM) and a PLEXTOR PX-256M8PeGN NVMe SSD.
> >
> > I first detected the problem while upgrading from 6.13.8 to 6.14.3, that my system cannot find the NVME SSD which contains the rootfs. After reverting the two patches:
> >
> > - ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
> > - 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
> >
> > my system booted fine again.
> > After that I tested the patches sent by Niklas in this thread, which fixed the issue, so I sent Tested-by.
> >
> > I did another test Today with 6.15.0-rc6, which in itself does not find my SSD. Niklas asked me to test with these
> >
> > - revert ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
> > - revert 0e0b45ab5d77 ("PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ")
> > - apply the following patch:
> >
> > diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> > index b3615d125942..5dee689ecd95 100644
> > --- a/drivers/pci/controller/dwc/pcie-designware.c
> > +++ b/drivers/pci/controller/dwc/pcie-designware.c
> > @@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
> > if (dw_pcie_link_up(pci))
> > break;
> >
> > - msleep(LINK_WAIT_SLEEP_MS);
> > + usleep_range(100, 200);
> > }
> >
> > if (retries >= LINK_WAIT_MAX_RETRIES) {
> >
> >
> > which restores the original behaviour to wait for link-up, then shorten the time. This resulted again a non booting system, this time with "Phy link never came up" error message.
>
> That message was unexpected.
>
> What I expected to happen was that the link would come up, but by reducing
> delay between "link is up" and device is accessed (from 90 ms -> 100 us),
> I was that you would see the same problem on "older" kernels as you do with
> the "link up IRQ" patches (which originally had no delay, but this series
> basically re-added the same delay (PCIE_T_RRS_READY_MS, 100 ms) as we had
> before (LINK_WAIT_SLEEP_MS, 90 ms).
>
> But I see the problem with the test code that I asked you to test to verify
> that this problem also existed before (if you had a shorter delay).
> (By reducing the delay, the LINK_WAIT_MAX_RETRIES also need to be bumped.)
>
> Could you please test:
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
> index b3615d125942..5dee689ecd95 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -692,7 +692,7 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
> if (dw_pcie_link_up(pci))
> break;
>
> - msleep(LINK_WAIT_SLEEP_MS);
> + usleep_range(100, 200);
> }
>
> if (retries >= LINK_WAIT_MAX_RETRIES) {
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index 4dd16aa4b39e..8422661b79d5 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -61,7 +61,7 @@
> set_bit(DW_PCIE_CAP_ ## _cap, &(_pci)->caps)
>
> /* Parameters for the waiting for link up routine */
> -#define LINK_WAIT_MAX_RETRIES 10
> +#define LINK_WAIT_MAX_RETRIES 10000
> #define LINK_WAIT_SLEEP_MS 90
>
> /* Parameters for the waiting for iATU enabled routine */
>
>
> On top of an old kernel instead?
>
I have compiled a vanilla 6.12.28, that booted fine, as expeced. Then compiled a version with the patch directly above.
> We expect the link to come up, but that you will not be able to mount rootfs.
>
That is exactly what happened.
> If that is the case, we are certain that the this patch series is 100% needed
> for your device to have the same functional behavior as before.
That is the case.
Bye,
Laszlo Fiat
>
>
> Kind regards,
> Niklas
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-16 18:48 ` Laszlo Fiat
@ 2025-05-19 9:44 ` Niklas Cassel
2025-05-19 12:10 ` Manivannan Sadhasivam
0 siblings, 1 reply; 34+ messages in thread
From: Niklas Cassel @ 2025-05-19 9:44 UTC (permalink / raw)
To: Laszlo Fiat, Manivannan Sadhasivam, Krzysztof Wilczyński
Cc: Lorenzo Pieralisi, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Krishna chaitanya chundru, Wilfred Mallawa, Damien Le Moal,
Hans Zhang, Krzysztof Wilczyński, linux-pci@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-rockchip@lists.infradead.org, linux-arm-msm@vger.kernel.org
On Fri, May 16, 2025 at 06:48:59PM +0000, Laszlo Fiat wrote:
>
> I have compiled a vanilla 6.12.28, that booted fine, as expeced. Then compiled a version with the patch directly above.
>
> > We expect the link to come up, but that you will not be able to mount rootfs.
> >
>
> That is exactly what happened.
>
> > If that is the case, we are certain that the this patch series is 100% needed
> > for your device to have the same functional behavior as before.
>
> That is the case.
Laszlo,
Thank you for verifying!
>
> Bye,
>
> Laszlo Fiat
Mani, Krzysztof,
It does indeed seem like the reason for this regression is that this weird
NVMe drive requires some additional time after link up to be ready.
This series does re-add a similar delay, but in the 'link up' IRQ thread,
so even with this series, things are nicer than before (where we had an
unconditional delay after starting the link).
Do you require any additional changes?
Otherwise, I think it would be nice if we could get this queued up before
the merge window.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-19 9:44 ` Niklas Cassel
@ 2025-05-19 12:10 ` Manivannan Sadhasivam
0 siblings, 0 replies; 34+ messages in thread
From: Manivannan Sadhasivam @ 2025-05-19 12:10 UTC (permalink / raw)
To: Niklas Cassel
Cc: Laszlo Fiat, Krzysztof Wilczyński, Lorenzo Pieralisi,
Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Krishna chaitanya chundru, Wilfred Mallawa, Damien Le Moal,
Hans Zhang, Krzysztof Wilczyński, linux-pci@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-rockchip@lists.infradead.org, linux-arm-msm@vger.kernel.org
On Mon, May 19, 2025 at 11:44:17AM +0200, Niklas Cassel wrote:
> On Fri, May 16, 2025 at 06:48:59PM +0000, Laszlo Fiat wrote:
> >
> > I have compiled a vanilla 6.12.28, that booted fine, as expeced. Then compiled a version with the patch directly above.
> >
> > > We expect the link to come up, but that you will not be able to mount rootfs.
> > >
> >
> > That is exactly what happened.
> >
> > > If that is the case, we are certain that the this patch series is 100% needed
> > > for your device to have the same functional behavior as before.
> >
> > That is the case.
>
> Laszlo,
>
> Thank you for verifying!
>
> >
> > Bye,
> >
> > Laszlo Fiat
>
> Mani, Krzysztof,
>
> It does indeed seem like the reason for this regression is that this weird
> NVMe drive requires some additional time after link up to be ready.
>
> This series does re-add a similar delay, but in the 'link up' IRQ thread,
> so even with this series, things are nicer than before (where we had an
> unconditional delay after starting the link).
>
> Do you require any additional changes?
>
No. Sorry for the delay in getting back. I was at Linaro Connect last week and
had to take some vacation for the first half of this week.
It is fine with me to add the delay only in the Rockchip platform where the
issue was reported (next time please mention the issue that triggered the series
in the cover letter). But I do not want to add the same in Qcom platform as the
delay is not strictly required as per the spec (if I understood it properly).
So I'll merge rest of the 3 patches.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes
2025-05-06 7:39 [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
` (3 preceding siblings ...)
2025-05-13 10:53 ` Manivannan Sadhasivam
@ 2025-05-19 12:37 ` Manivannan Sadhasivam
4 siblings, 0 replies; 34+ messages in thread
From: Manivannan Sadhasivam @ 2025-05-19 12:37 UTC (permalink / raw)
To: Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
Bjorn Helgaas, Heiko Stuebner, Krishna chaitanya chundru,
Niklas Cassel
Cc: Manivannan Sadhasivam, Wilfred Mallawa, Damien Le Moal,
Hans Zhang, Laszlo Fiat, Krzysztof Wilczyński, linux-pci,
linux-arm-kernel, linux-rockchip, linux-arm-msm
On Tue, 06 May 2025 09:39:35 +0200, Niklas Cassel wrote:
> Commit 8d3bf19f1b58 ("PCI: dwc: Don't wait for link up if driver can detect
> Link Up event") added support for DWC to not wait for link up before
> enumerating the bus. However, we cannot simply enumerate the bus after
> receiving a Link Up IRQ, we still need to wait PCIE_T_RRS_READY_MS time
> to allow a device to become ready after deasserting PERST. To avoid
> bringing back an conditional delay during probe, perform the wait in the
> threaded IRQ handler instead.
>
> [...]
Applied to controller/dw-rockchip, thanks!
[1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
commit: c1c9c4b6130defb671c45b53ec12fcd92afea971
[3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro
commit: 9ad44a6f6f8775786a9543427c82625d96a340b4
[4/4] PCI: qcom: Replace PERST sleep time with proper macro
commit: be0b42befa87d81780aa0f003644b0ad520b0234
(I do need to change the name of the topic-branch now...)
Best regards,
--
Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-06 7:39 ` [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready Niklas Cassel
2025-05-06 11:32 ` Laszlo Fiat
2025-05-06 22:23 ` Wilfred Mallawa
@ 2025-05-28 22:42 ` Bjorn Helgaas
2025-05-30 13:57 ` Niklas Cassel
2025-05-30 15:59 ` Manivannan Sadhasivam
2 siblings, 2 replies; 34+ messages in thread
From: Bjorn Helgaas @ 2025-05-28 22:42 UTC (permalink / raw)
To: Niklas Cassel
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Wilfred Mallawa, Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Tue, May 06, 2025 at 09:39:36AM +0200, Niklas Cassel wrote:
> Commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can
> detect Link Up") changed so that we no longer call dw_pcie_wait_for_link(),
> and instead enumerate the bus when receiving a Link Up IRQ.
>
> Laszlo Fiat reported (off-list) that his PLEXTOR PX-256M8PeGN NVMe SSD is
> no longer functional, and simply reverting commit ec9fd499b9c6 ("PCI:
> dw-rockchip: Don't wait for link since we can detect Link Up") makes his
> SSD functional again.
>
> It seems that we are enumerating the bus before the endpoint is ready.
> Adding a msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
> threaded IRQ handler makes the SSD functional once again.
This sounds like a problem that could happen with any controller, not
just dw-rockchip? Are we missing some required delay that should be
in generic code? Or is this a PLEXTOR defect that everybody has to
pay the price for?
Delays like this are really hard to get rid of once we add them, so
I'm a little bit cautious.
> What appears to happen is that before ec9fd499b9c6, we called
> dw_pcie_wait_for_link(), and in the first iteration of the loop, the link
> will never be up (because the link was just started),
> dw_pcie_wait_for_link() will then sleep for LINK_WAIT_SLEEP_MS (90 ms),
> before trying again.
>
> This means that even if a driver was missing a msleep(PCIE_T_RRS_READY_MS)
> (100 ms), because of the call to dw_pcie_wait_for_link(), enumerating the
> bus would essentially be delayed by that time anyway (because of the sleep
> LINK_WAIT_SLEEP_MS (90 ms)).
>
> While we could add the msleep(PCIE_T_RRS_READY_MS) after deasserting PERST,
> that would essentially bring back an unconditional delay during synchronous
> probe (the whole reason to use a Link Up IRQ was to avoid an unconditional
> delay during probe).
>
> Thus, add the msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
> IRQ handler. This way, we will not have an unconditional delay during boot
> for unpopulated PCIe slots.
>
> Cc: Laszlo Fiat <laszlo.fiat@proton.me>
> Fixes: ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can detect Link Up")
> Signed-off-by: Niklas Cassel <cassel@kernel.org>
> ---
> drivers/pci/controller/dwc/pcie-dw-rockchip.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> index 3c6ab71c996e..6a7ec3545a7f 100644
> --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> @@ -23,6 +23,7 @@
> #include <linux/reset.h>
>
> #include "pcie-designware.h"
> +#include "../../pci.h"
>
> /*
> * The upper 16 bits of PCIE_CLIENT_CONFIG are a write
> @@ -458,6 +459,7 @@ static irqreturn_t rockchip_pcie_rc_sys_irq_thread(int irq, void *arg)
> if (reg & PCIE_RDLH_LINK_UP_CHGED) {
> if (rockchip_pcie_link_up(pci)) {
> dev_dbg(dev, "Received Link up event. Starting enumeration!\n");
> + msleep(PCIE_T_RRS_READY_MS);
> /* Rescan the bus to enumerate endpoint devices */
> pci_lock_rescan_remove();
> pci_rescan_bus(pp->bridge->bus);
> --
> 2.49.0
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-28 22:42 ` Bjorn Helgaas
@ 2025-05-30 13:57 ` Niklas Cassel
2025-05-30 15:21 ` Bjorn Helgaas
2025-05-30 15:59 ` Manivannan Sadhasivam
1 sibling, 1 reply; 34+ messages in thread
From: Niklas Cassel @ 2025-05-30 13:57 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Wilfred Mallawa, Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Wed, May 28, 2025 at 05:42:51PM -0500, Bjorn Helgaas wrote:
> On Tue, May 06, 2025 at 09:39:36AM +0200, Niklas Cassel wrote:
> > Commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can
> > detect Link Up") changed so that we no longer call dw_pcie_wait_for_link(),
> > and instead enumerate the bus when receiving a Link Up IRQ.
> >
> > Laszlo Fiat reported (off-list) that his PLEXTOR PX-256M8PeGN NVMe SSD is
> > no longer functional, and simply reverting commit ec9fd499b9c6 ("PCI:
> > dw-rockchip: Don't wait for link since we can detect Link Up") makes his
> > SSD functional again.
> >
> > It seems that we are enumerating the bus before the endpoint is ready.
> > Adding a msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
> > threaded IRQ handler makes the SSD functional once again.
>
> This sounds like a problem that could happen with any controller, not
> just dw-rockchip?
Correct.
> Are we missing some required delay that should be
> in generic code? Or is this a PLEXTOR defect that everybody has to
> pay the price for?
So far, the Plextor drive is the only endpoint that we know of, which is
not working without the delay.
We have no idea if this Plextor drive is the only bad behaving endpoint or
if there are many endpoints with similar issues, because before the
use_linkup_irq() callback was introduced, we always had (the equivalent of)
this delay.
Since this will only delay once the link up IRQ is triggered, it will not
affect the boot time when there are no endpoint plugged in to the slot, so
it seemed quite harmless to reintroduce this delay before enumeration.
But other suggestions are of course welcome.
Since it seems that we can read the PCI vendor and PCI device ID, it seems
that at least some config space reads seem to work, so I guess that we could
try to quirk the PCI vendor and PCI device ID in the nvme driver.
The nvme driver does have a NVME_QUIRK_DELAY_BEFORE_CHK_RDY quirk for some
Samsung drive, perhaps we could try something similar to the Plextor drive?
I don't personally have this problematic NVMe drive, so I am not able to test
such a patch. The user who reported the problem, Laszlo, has been doing all
the testing.
Perhaps Laszlo could try something like:
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 6b04473c0ab7..9c409af34548 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2865,6 +2865,12 @@ static const struct nvme_core_quirk_entry core_quirks[] = {
.quirks = NVME_QUIRK_DELAY_BEFORE_CHK_RDY |
NVME_QUIRK_NO_DEEPEST_PS |
NVME_QUIRK_IGNORE_DEV_SUBNQN,
+ },
+ {
+
+ .vid = 0x144d,
+ .mn = "Samsung Portable SSD X5",
+ .quirks = NVME_QUIRK_DELAY_BEFORE_CHK_RDY,
}
};
.. with the .vid and .mn fields replacd with the correct ones for the Plextor
drive. (Don't forget to revert patch in $subject when testing this alternate
solution.)
I don't have a preference for either solution.
Kind regards,
Niklas
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-30 13:57 ` Niklas Cassel
@ 2025-05-30 15:21 ` Bjorn Helgaas
0 siblings, 0 replies; 34+ messages in thread
From: Bjorn Helgaas @ 2025-05-30 15:21 UTC (permalink / raw)
To: Niklas Cassel
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas, Heiko Stuebner,
Wilfred Mallawa, Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Fri, May 30, 2025 at 03:57:14PM +0200, Niklas Cassel wrote:
> On Wed, May 28, 2025 at 05:42:51PM -0500, Bjorn Helgaas wrote:
> > On Tue, May 06, 2025 at 09:39:36AM +0200, Niklas Cassel wrote:
> > > Commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can
> > > detect Link Up") changed so that we no longer call dw_pcie_wait_for_link(),
> > > and instead enumerate the bus when receiving a Link Up IRQ.
> > >
> > > Laszlo Fiat reported (off-list) that his PLEXTOR PX-256M8PeGN NVMe SSD is
> > > no longer functional, and simply reverting commit ec9fd499b9c6 ("PCI:
> > > dw-rockchip: Don't wait for link since we can detect Link Up") makes his
> > > SSD functional again.
> > >
> > > It seems that we are enumerating the bus before the endpoint is ready.
> > > Adding a msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
> > > threaded IRQ handler makes the SSD functional once again.
> >
> > This sounds like a problem that could happen with any controller, not
> > just dw-rockchip?
>
> Correct.
>
> > Are we missing some required delay that should be
> > in generic code? Or is this a PLEXTOR defect that everybody has to
> > pay the price for?
>
> So far, the Plextor drive is the only endpoint that we know of, which is
> not working without the delay.
>
> We have no idea if this Plextor drive is the only bad behaving endpoint or
> if there are many endpoints with similar issues, because before the
> use_linkup_irq() callback was introduced, we always had (the equivalent of)
> this delay.
>
> Since this will only delay once the link up IRQ is triggered, it will not
> affect the boot time when there are no endpoint plugged in to the slot, so
> it seemed quite harmless to reintroduce this delay before enumeration.
>
> But other suggestions are of course welcome.
>
>
> Since it seems that we can read the PCI vendor and PCI device ID, it seems
> that at least some config space reads seem to work, so I guess that we could
> try to quirk the PCI vendor and PCI device ID in the nvme driver.
>
> The nvme driver does have a NVME_QUIRK_DELAY_BEFORE_CHK_RDY quirk for some
> Samsung drive, perhaps we could try something similar to the Plextor drive?
Looks like NVME_QUIRK_DELAY_BEFORE_CHK_RDY is used for Seagate (Vendor
ID 0x1bb1), Samsung (0x144d), Memblaze (0x1c5f), and HGST/WDC (0x1c58,
annoyingly not shown as reserved by
https://pcisig.com/membership/member-companies?combine=1c58) adapters.
If we can read the Vendor and Device IDs (and, I assume, BARs and
other things used by enumeration), it sounds like possibly an NVMe
issue, not a PCI issue.
If we can use NVME_QUIRK_DELAY_BEFORE_CHK_RDY for this drive as well,
I think we should because it at least connects the extra delay with a
specific device.
> I don't personally have this problematic NVMe drive, so I am not able to test
> such a patch. The user who reported the problem, Laszlo, has been doing all
> the testing.
>
>
> Perhaps Laszlo could try something like:
>
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 6b04473c0ab7..9c409af34548 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2865,6 +2865,12 @@ static const struct nvme_core_quirk_entry core_quirks[] = {
> .quirks = NVME_QUIRK_DELAY_BEFORE_CHK_RDY |
> NVME_QUIRK_NO_DEEPEST_PS |
> NVME_QUIRK_IGNORE_DEV_SUBNQN,
> + },
> + {
> +
> + .vid = 0x144d,
> + .mn = "Samsung Portable SSD X5",
> + .quirks = NVME_QUIRK_DELAY_BEFORE_CHK_RDY,
> }
> };
>
>
> .. with the .vid and .mn fields replacd with the correct ones for the Plextor
> drive. (Don't forget to revert patch in $subject when testing this alternate
> solution.)
>
> I don't have a preference for either solution.
>
>
> Kind regards,
> Niklas
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-28 22:42 ` Bjorn Helgaas
2025-05-30 13:57 ` Niklas Cassel
@ 2025-05-30 15:59 ` Manivannan Sadhasivam
2025-05-30 17:19 ` Bjorn Helgaas
1 sibling, 1 reply; 34+ messages in thread
From: Manivannan Sadhasivam @ 2025-05-30 15:59 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Niklas Cassel, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, Bjorn Helgaas, Heiko Stuebner, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Wed, May 28, 2025 at 05:42:51PM -0500, Bjorn Helgaas wrote:
> On Tue, May 06, 2025 at 09:39:36AM +0200, Niklas Cassel wrote:
> > Commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can
> > detect Link Up") changed so that we no longer call dw_pcie_wait_for_link(),
> > and instead enumerate the bus when receiving a Link Up IRQ.
> >
> > Laszlo Fiat reported (off-list) that his PLEXTOR PX-256M8PeGN NVMe SSD is
> > no longer functional, and simply reverting commit ec9fd499b9c6 ("PCI:
> > dw-rockchip: Don't wait for link since we can detect Link Up") makes his
> > SSD functional again.
> >
> > It seems that we are enumerating the bus before the endpoint is ready.
> > Adding a msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
> > threaded IRQ handler makes the SSD functional once again.
>
> This sounds like a problem that could happen with any controller, not
> just dw-rockchip? Are we missing some required delay that should be
> in generic code? Or is this a PLEXTOR defect that everybody has to
> pay the price for?
>
> Delays like this are really hard to get rid of once we add them, so
> I'm a little bit cautious.
>
Ok, I digged into the spec a little more and I could see below paragraph in
r6.0, sec 6.6.1 for devices not supporting Device Readiness Status (DRS):
"With a Downstream Port that does not support Link speeds greater than 5.0 GT/s,
software must wait a minimum of 100 ms following exit from a Conventional Reset
before sending a Configuration Request to the device immediately below that
Port.
With a Downstream Port that supports Link speeds greater than 5.0 GT/s,
software must wait a minimum of 100 ms after Link training completes before
sending a Configuration Request to the device immediately below that Port.
Software can determine when Link training completes by polling the Data Link
Layer Link Active bit or by setting up an associated interrupt
(see § Section 6.7.3.3 ). It is strongly recommended for software to use this
mechanism whenever the Downstream Port supports it."
We are not checking for DRS after the PERST# deassert or after link is up, I
think DRS check only applies to enumerated devices, but I'm not 100% sure. But
if we assume that the devices doesn't support DRS, then we should make sure
that all controller drivers wait for 100ms even after link up event before
issuing the config request.
So I don't think this is a device specific issue but rather controller specific.
And this makes the Qcom patch that I dropped a valid one (ofc with change in
description).
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-30 15:59 ` Manivannan Sadhasivam
@ 2025-05-30 17:19 ` Bjorn Helgaas
2025-05-30 17:24 ` Niklas Cassel
0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2025-05-30 17:19 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Niklas Cassel, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, Bjorn Helgaas, Heiko Stuebner, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Fri, May 30, 2025 at 09:29:57PM +0530, Manivannan Sadhasivam wrote:
> On Wed, May 28, 2025 at 05:42:51PM -0500, Bjorn Helgaas wrote:
> > On Tue, May 06, 2025 at 09:39:36AM +0200, Niklas Cassel wrote:
> > > Commit ec9fd499b9c6 ("PCI: dw-rockchip: Don't wait for link since we can
> > > detect Link Up") changed so that we no longer call dw_pcie_wait_for_link(),
> > > and instead enumerate the bus when receiving a Link Up IRQ.
> > >
> > > Laszlo Fiat reported (off-list) that his PLEXTOR PX-256M8PeGN NVMe SSD is
> > > no longer functional, and simply reverting commit ec9fd499b9c6 ("PCI:
> > > dw-rockchip: Don't wait for link since we can detect Link Up") makes his
> > > SSD functional again.
> > >
> > > It seems that we are enumerating the bus before the endpoint is ready.
> > > Adding a msleep(PCIE_T_RRS_READY_MS) before enumerating the bus in the
> > > threaded IRQ handler makes the SSD functional once again.
> >
> > This sounds like a problem that could happen with any controller, not
> > just dw-rockchip? Are we missing some required delay that should be
> > in generic code? Or is this a PLEXTOR defect that everybody has to
> > pay the price for?
> >
> > Delays like this are really hard to get rid of once we add them, so
> > I'm a little bit cautious.
>
> Ok, I digged into the spec a little more and I could see below
> paragraph in r6.0, sec 6.6.1 for devices not supporting Device
> Readiness Status (DRS):
>
> "With a Downstream Port that does not support Link speeds greater
> than 5.0 GT/s, software must wait a minimum of 100 ms following exit
> from a Conventional Reset before sending a Configuration Request to
> the device immediately below that Port.
>
> With a Downstream Port that supports Link speeds greater than 5.0
> GT/s, software must wait a minimum of 100 ms after Link training
> completes before sending a Configuration Request to the device
> immediately below that Port. Software can determine when Link
> training completes by polling the Data Link Layer Link Active bit or
> by setting up an associated interrupt (see § Section 6.7.3.3 ). It
> is strongly recommended for software to use this mechanism whenever
> the Downstream Port supports it."
>
> We are not checking for DRS after the PERST# deassert or after link
> is up, I think DRS check only applies to enumerated devices, but I'm
> not 100% sure. But if we assume that the devices doesn't support
> DRS, then we should make sure that all controller drivers wait for
> 100ms even after link up event before issuing the config request.
I don't think we have any DRS support yet.
I think all drivers should wait PCIE_T_RRS_READY_MS (100ms) after exit
from Conventional Reset (if port only supports <= 5.0 GT/s) or after
link training completes (if port supports > 5.0 GT/s).
> So I don't think this is a device specific issue but rather
> controller specific. And this makes the Qcom patch that I dropped a
> valid one (ofc with change in description).
URL?
Bjorn
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-30 17:19 ` Bjorn Helgaas
@ 2025-05-30 17:24 ` Niklas Cassel
2025-05-30 19:43 ` Bjorn Helgaas
0 siblings, 1 reply; 34+ messages in thread
From: Niklas Cassel @ 2025-05-30 17:24 UTC (permalink / raw)
To: Bjorn Helgaas, Manivannan Sadhasivam
Cc: Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
Bjorn Helgaas, Heiko Stuebner, Wilfred Mallawa, Damien Le Moal,
Hans Zhang, Laszlo Fiat, Krzysztof Wilczyński, linux-pci,
linux-arm-kernel, linux-rockchip
On 30 May 2025 19:19:37 CEST, Bjorn Helgaas <helgaas@kernel.org> wrote:
>
>I think all drivers should wait PCIE_T_RRS_READY_MS (100ms) after exit
>from Conventional Reset (if port only supports <= 5.0 GT/s) or after
>link training completes (if port supports > 5.0 GT/s).
>
>> So I don't think this is a device specific issue but rather
>> controller specific. And this makes the Qcom patch that I dropped a
>> valid one (ofc with change in description).
>
>URL?
PATCH 4/4 of this series.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-30 17:24 ` Niklas Cassel
@ 2025-05-30 19:43 ` Bjorn Helgaas
2025-05-31 6:47 ` Manivannan Sadhasivam
0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2025-05-30 19:43 UTC (permalink / raw)
To: Niklas Cassel
Cc: Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Heiko Stuebner, Wilfred Mallawa, Damien Le Moal, Hans Zhang,
Laszlo Fiat, Krzysztof Wilczyński, linux-pci,
linux-arm-kernel, linux-rockchip
On Fri, May 30, 2025 at 07:24:53PM +0200, Niklas Cassel wrote:
> On 30 May 2025 19:19:37 CEST, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >
> >I think all drivers should wait PCIE_T_RRS_READY_MS (100ms) after exit
> >from Conventional Reset (if port only supports <= 5.0 GT/s) or after
> >link training completes (if port supports > 5.0 GT/s).
> >
> >> So I don't think this is a device specific issue but rather
> >> controller specific. And this makes the Qcom patch that I dropped a
> >> valid one (ofc with change in description).
> >
> >URL?
>
> PATCH 4/4 of this series.
If you mean
https://lore.kernel.org/r/20250506073934.433176-10-cassel@kernel.org,
that patch merely replaces "100" with PCIE_T_PVPERL_MS, which doesn't
fix anything and is valid regardless of this Plextor-related patch
("PCI: dw-rockchip: Do not enumerate bus before endpoint devices are
ready").
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-30 19:43 ` Bjorn Helgaas
@ 2025-05-31 6:47 ` Manivannan Sadhasivam
2025-06-03 14:08 ` Niklas Cassel
0 siblings, 1 reply; 34+ messages in thread
From: Manivannan Sadhasivam @ 2025-05-31 6:47 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Niklas Cassel, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, Bjorn Helgaas, Heiko Stuebner, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Fri, May 30, 2025 at 02:43:47PM -0500, Bjorn Helgaas wrote:
> On Fri, May 30, 2025 at 07:24:53PM +0200, Niklas Cassel wrote:
> > On 30 May 2025 19:19:37 CEST, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > >
> > >I think all drivers should wait PCIE_T_RRS_READY_MS (100ms) after exit
> > >from Conventional Reset (if port only supports <= 5.0 GT/s) or after
> > >link training completes (if port supports > 5.0 GT/s).
> > >
> > >> So I don't think this is a device specific issue but rather
> > >> controller specific. And this makes the Qcom patch that I dropped a
> > >> valid one (ofc with change in description).
> > >
> > >URL?
> >
> > PATCH 4/4 of this series.
>
> If you mean
> https://lore.kernel.org/r/20250506073934.433176-10-cassel@kernel.org,
> that patch merely replaces "100" with PCIE_T_PVPERL_MS, which doesn't
> fix anything and is valid regardless of this Plextor-related patch
> ("PCI: dw-rockchip: Do not enumerate bus before endpoint devices are
> ready").
It is patch 2/4:
https://lore.kernel.org/all/20250506073934.433176-8-cassel@kernel.org
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-05-31 6:47 ` Manivannan Sadhasivam
@ 2025-06-03 14:08 ` Niklas Cassel
2025-06-03 18:12 ` Bjorn Helgaas
0 siblings, 1 reply; 34+ messages in thread
From: Niklas Cassel @ 2025-06-03 14:08 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, Bjorn Helgaas, Heiko Stuebner, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Sat, May 31, 2025 at 12:17:43PM +0530, Manivannan Sadhasivam wrote:
> On Fri, May 30, 2025 at 02:43:47PM -0500, Bjorn Helgaas wrote:
> > On Fri, May 30, 2025 at 07:24:53PM +0200, Niklas Cassel wrote:
> > > On 30 May 2025 19:19:37 CEST, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > >
> > > >I think all drivers should wait PCIE_T_RRS_READY_MS (100ms) after exit
> > > >from Conventional Reset (if port only supports <= 5.0 GT/s) or after
> > > >link training completes (if port supports > 5.0 GT/s).
> > > >
> > > >> So I don't think this is a device specific issue but rather
> > > >> controller specific. And this makes the Qcom patch that I dropped a
> > > >> valid one (ofc with change in description).
> > > >
> > > >URL?
> > >
> > > PATCH 4/4 of this series.
> >
> > If you mean
> > https://lore.kernel.org/r/20250506073934.433176-10-cassel@kernel.org,
> > that patch merely replaces "100" with PCIE_T_PVPERL_MS, which doesn't
> > fix anything and is valid regardless of this Plextor-related patch
> > ("PCI: dw-rockchip: Do not enumerate bus before endpoint devices are
> > ready").
>
> It is patch 2/4:
> https://lore.kernel.org/all/20250506073934.433176-8-cassel@kernel.org
Hello all,
I'm getting some mixed messages here.
If I understand Bjorn correctly, he would prefer a NVMe quirk, and looking
at pci/next, PATCH 1/4 has been dropped.
If I understand Mani correctly, he thinks that we should queue up PATCH 1/4
and PATCH 2/4 (although with modified commit messages).
So, what is the consensus here?
As you know, I do not have the (problematic) Plextor drive, so we go with
the quirk option, then we would need to ask Laszlo nicely to retest.
(And to provide the PCI device and PCI vendor ID of his NVMe device so we
can write a quirk.)
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-06-03 14:08 ` Niklas Cassel
@ 2025-06-03 18:12 ` Bjorn Helgaas
2025-06-04 11:40 ` Niklas Cassel
0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2025-06-03 18:12 UTC (permalink / raw)
To: Niklas Cassel
Cc: Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Heiko Stuebner, Wilfred Mallawa, Damien Le Moal, Hans Zhang,
Laszlo Fiat, Krzysztof Wilczyński, linux-pci,
linux-arm-kernel, linux-rockchip
On Tue, Jun 03, 2025 at 04:08:15PM +0200, Niklas Cassel wrote:
> On Sat, May 31, 2025 at 12:17:43PM +0530, Manivannan Sadhasivam wrote:
> > On Fri, May 30, 2025 at 02:43:47PM -0500, Bjorn Helgaas wrote:
> > > On Fri, May 30, 2025 at 07:24:53PM +0200, Niklas Cassel wrote:
> > > > On 30 May 2025 19:19:37 CEST, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > >
> > > > >I think all drivers should wait PCIE_T_RRS_READY_MS (100ms) after exit
> > > > >from Conventional Reset (if port only supports <= 5.0 GT/s) or after
> > > > >link training completes (if port supports > 5.0 GT/s).
> > > > >
> > > > >> So I don't think this is a device specific issue but rather
> > > > >> controller specific. And this makes the Qcom patch that I dropped a
> > > > >> valid one (ofc with change in description).
> > > > >
> > > > >URL?
> > > >
> > > > PATCH 4/4 of this series.
> > >
> > > If you mean
> > > https://lore.kernel.org/r/20250506073934.433176-10-cassel@kernel.org,
> > > that patch merely replaces "100" with PCIE_T_PVPERL_MS, which doesn't
> > > fix anything and is valid regardless of this Plextor-related patch
> > > ("PCI: dw-rockchip: Do not enumerate bus before endpoint devices are
> > > ready").
> >
> > It is patch 2/4:
> > https://lore.kernel.org/all/20250506073934.433176-8-cassel@kernel.org
>
> Hello all,
>
> I'm getting some mixed messages here.
>
> If I understand Bjorn correctly, he would prefer a NVMe quirk, and looking
> at pci/next, PATCH 1/4 has been dropped.
Hmmm, sorry, I misinterpreted both 1/4 and 2/4. I read them as "add
this delay so the PLEXTOR device works", but in fact, I think in both
cases, the delay is actually to enforce the PCIe r6.0, sec 6.6.1,
requirement for software to wait 100ms before issuing a config
request, and the fact that it makes PLEXTOR work is a side effect of
that.
The beginning of that 100ms delay is "exit from Conventional Reset"
(ports that support <= 5.0 GT/s) or "link training completes" (ports
that support > 5.0 GT/s).
I think we lack that 100ms delay in dwc drivers in general. The only
generic dwc delay is in dw_pcie_host_init() via the LINK_WAIT_SLEEP_MS
in dw_pcie_wait_for_link(), but that doesn't count because it's
*before* the link comes up. We have to wait 100ms *after* exiting
Conventional Reset or completing link training.
We don't know when the exit from Conventional Reset was, but it was
certainly before the link came up. In the absence of a timestamp for
exit from reset, starting the wait after link-up is probably the best
we can do. This could be either after dw_pcie_wait_for_link() finds
the link up or when we handle the link-up interrupt.
Patches 1 and 2 would fix the link-up interrupt case. I think we need
another patch for the dwc core for dw_pcie_wait_for_link().
I wish I'd had time to spend on this and include patches 1 and 2, but
we're up against the merge window wire and I'll be out the end of this
week, so I think they'll have to wait. It seems like something we can
still justify for v6.16 though.
This also means I don't think we should need an NVMe quirk.
> If I understand Mani correctly, he thinks that we should queue up PATCH 1/4
> and PATCH 2/4 (although with modified commit messages).
>
> As you know, I do not have the (problematic) Plextor drive, so we go with
> the quirk option, then we would need to ask Laszlo nicely to retest.
> (And to provide the PCI device and PCI vendor ID of his NVMe device so we
> can write a quirk.)
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-06-03 18:12 ` Bjorn Helgaas
@ 2025-06-04 11:40 ` Niklas Cassel
2025-06-04 17:10 ` Manivannan Sadhasivam
0 siblings, 1 reply; 34+ messages in thread
From: Niklas Cassel @ 2025-06-04 11:40 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Heiko Stuebner, Wilfred Mallawa, Damien Le Moal, Hans Zhang,
Laszlo Fiat, Krzysztof Wilczyński, linux-pci,
linux-arm-kernel, linux-rockchip
On Tue, Jun 03, 2025 at 01:12:50PM -0500, Bjorn Helgaas wrote:
>
> Hmmm, sorry, I misinterpreted both 1/4 and 2/4. I read them as "add
> this delay so the PLEXTOR device works", but in fact, I think in both
> cases, the delay is actually to enforce the PCIe r6.0, sec 6.6.1,
> requirement for software to wait 100ms before issuing a config
> request, and the fact that it makes PLEXTOR work is a side effect of
> that.
Well, the Plextor NVMe drive used to work with previous kernels,
but regressed.
But yes, the delay was added to enforce "PCIe r6.0, sec 6.6.1"
requirement for software to wait 100ms, which once again makes
the Plextor NVMe drive work.
>
> The beginning of that 100ms delay is "exit from Conventional Reset"
> (ports that support <= 5.0 GT/s) or "link training completes" (ports
> that support > 5.0 GT/s).
>
> I think we lack that 100ms delay in dwc drivers in general. The only
> generic dwc delay is in dw_pcie_host_init() via the LINK_WAIT_SLEEP_MS
> in dw_pcie_wait_for_link(), but that doesn't count because it's
> *before* the link comes up. We have to wait 100ms *after* exiting
> Conventional Reset or completing link training.
In dw_pcie_wait_for_link(), in the first iteration of the loop, the link
will never be up (because the link was just started),
dw_pcie_wait_for_link() will then sleep for LINK_WAIT_SLEEP_MS (90 ms),
before trying again.
Most likely the link training took way less than 100 ms, so most of those
90 ms will probably be after link training has completed.
That is most likely why Plextor worked on older kernels (which does not
use the link up IRQ).
If we add a 100 ms sleep after wait_for_link(), then I suggest that we
also reduce LINK_WAIT_SLEEP_MS to something shorter.
>
> We don't know when the exit from Conventional Reset was, but it was
> certainly before the link came up. In the absence of a timestamp for
> exit from reset, starting the wait after link-up is probably the best
> we can do. This could be either after dw_pcie_wait_for_link() finds
> the link up or when we handle the link-up interrupt.
>
> Patches 1 and 2 would fix the link-up interrupt case. I think we need
> another patch for the dwc core for dw_pcie_wait_for_link().
I agree, sounds like a plan.
>
> I wish I'd had time to spend on this and include patches 1 and 2, but
> we're up against the merge window wire and I'll be out the end of this
> week, so I think they'll have to wait. It seems like something we can
> still justify for v6.16 though.
I think it sounds good to target this as fixes for v6.16.
Do you plan to send out something after -rc1, or do you prefer me to do it?
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-06-04 11:40 ` Niklas Cassel
@ 2025-06-04 17:10 ` Manivannan Sadhasivam
2025-06-04 18:44 ` Bjorn Helgaas
0 siblings, 1 reply; 34+ messages in thread
From: Manivannan Sadhasivam @ 2025-06-04 17:10 UTC (permalink / raw)
To: Niklas Cassel
Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, Bjorn Helgaas, Heiko Stuebner, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Wed, Jun 04, 2025 at 01:40:52PM +0200, Niklas Cassel wrote:
> On Tue, Jun 03, 2025 at 01:12:50PM -0500, Bjorn Helgaas wrote:
> >
> > Hmmm, sorry, I misinterpreted both 1/4 and 2/4. I read them as "add
> > this delay so the PLEXTOR device works", but in fact, I think in both
> > cases, the delay is actually to enforce the PCIe r6.0, sec 6.6.1,
> > requirement for software to wait 100ms before issuing a config
> > request, and the fact that it makes PLEXTOR work is a side effect of
> > that.
>
> Well, the Plextor NVMe drive used to work with previous kernels,
> but regressed.
>
> But yes, the delay was added to enforce "PCIe r6.0, sec 6.6.1"
> requirement for software to wait 100ms, which once again makes
> the Plextor NVMe drive work.
>
>
> >
> > The beginning of that 100ms delay is "exit from Conventional Reset"
> > (ports that support <= 5.0 GT/s) or "link training completes" (ports
> > that support > 5.0 GT/s).
> >
> > I think we lack that 100ms delay in dwc drivers in general. The only
> > generic dwc delay is in dw_pcie_host_init() via the LINK_WAIT_SLEEP_MS
> > in dw_pcie_wait_for_link(), but that doesn't count because it's
> > *before* the link comes up. We have to wait 100ms *after* exiting
> > Conventional Reset or completing link training.
>
> In dw_pcie_wait_for_link(), in the first iteration of the loop, the link
> will never be up (because the link was just started),
> dw_pcie_wait_for_link() will then sleep for LINK_WAIT_SLEEP_MS (90 ms),
> before trying again.
>
> Most likely the link training took way less than 100 ms, so most of those
> 90 ms will probably be after link training has completed.
>
> That is most likely why Plextor worked on older kernels (which does not
> use the link up IRQ).
>
> If we add a 100 ms sleep after wait_for_link(), then I suggest that we
> also reduce LINK_WAIT_SLEEP_MS to something shorter.
>
No. The 900ms sleep is to make sure that we wait 1s before erroring out
assuming that the device is not present. This is mandated by the spec. So
irrespective of the delay we add *after* link up, we should try to detect the
link up for ~1s.
And for adding the delay, it should be done after the check for retry count:
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index b3615d125942..92eb661babeb 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -700,6 +700,8 @@ int dw_pcie_wait_for_link(struct dw_pcie *pci)
return -ETIMEDOUT;
}
+ msleep(PCIE_T_RRS_READY_MS);
+
offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
val = dw_pcie_readw_dbi(pci, offset + PCI_EXP_LNKSTA);
>
> >
> > We don't know when the exit from Conventional Reset was, but it was
> > certainly before the link came up. In the absence of a timestamp for
> > exit from reset, starting the wait after link-up is probably the best
> > we can do. This could be either after dw_pcie_wait_for_link() finds
> > the link up or when we handle the link-up interrupt.
> >
> > Patches 1 and 2 would fix the link-up interrupt case. I think we need
> > another patch for the dwc core for dw_pcie_wait_for_link().
>
> I agree, sounds like a plan.
>
>
> >
> > I wish I'd had time to spend on this and include patches 1 and 2, but
> > we're up against the merge window wire and I'll be out the end of this
> > week, so I think they'll have to wait. It seems like something we can
> > still justify for v6.16 though.
>
> I think it sounds good to target this as fixes for v6.16.
>
Yeah. Atleast patch 1 is fixing a regression, so it should be included for
v6.16.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply related [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-06-04 17:10 ` Manivannan Sadhasivam
@ 2025-06-04 18:44 ` Bjorn Helgaas
2025-06-05 12:28 ` Niklas Cassel
0 siblings, 1 reply; 34+ messages in thread
From: Bjorn Helgaas @ 2025-06-04 18:44 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Niklas Cassel, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, Bjorn Helgaas, Heiko Stuebner, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Wed, Jun 04, 2025 at 10:40:09PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Jun 04, 2025 at 01:40:52PM +0200, Niklas Cassel wrote:
> > On Tue, Jun 03, 2025 at 01:12:50PM -0500, Bjorn Helgaas wrote:
> > >
> > > Hmmm, sorry, I misinterpreted both 1/4 and 2/4. I read them as "add
> > > this delay so the PLEXTOR device works", but in fact, I think in both
> > > cases, the delay is actually to enforce the PCIe r6.0, sec 6.6.1,
> > > requirement for software to wait 100ms before issuing a config
> > > request, and the fact that it makes PLEXTOR work is a side effect of
> > > that.
> >
> > Well, the Plextor NVMe drive used to work with previous kernels,
> > but regressed.
> >
> > But yes, the delay was added to enforce "PCIe r6.0, sec 6.6.1"
> > requirement for software to wait 100ms, which once again makes
> > the Plextor NVMe drive work.
> >
> > > The beginning of that 100ms delay is "exit from Conventional Reset"
> > > (ports that support <= 5.0 GT/s) or "link training completes" (ports
> > > that support > 5.0 GT/s).
> > >
> > > I think we lack that 100ms delay in dwc drivers in general. The only
> > > generic dwc delay is in dw_pcie_host_init() via the LINK_WAIT_SLEEP_MS
> > > in dw_pcie_wait_for_link(), but that doesn't count because it's
> > > *before* the link comes up. We have to wait 100ms *after* exiting
> > > Conventional Reset or completing link training.
> >
> > In dw_pcie_wait_for_link(), in the first iteration of the loop, the link
> > will never be up (because the link was just started),
> > dw_pcie_wait_for_link() will then sleep for LINK_WAIT_SLEEP_MS (90 ms),
> > before trying again.
> >
> > Most likely the link training took way less than 100 ms, so most of those
> > 90 ms will probably be after link training has completed.
> >
> > That is most likely why Plextor worked on older kernels (which does not
> > use the link up IRQ).
Definitely seems plausible.
> > If we add a 100 ms sleep after wait_for_link(), then I suggest that we
> > also reduce LINK_WAIT_SLEEP_MS to something shorter.
>
> No. The 900ms sleep is to make sure that we wait 1s before erroring out
> assuming that the device is not present. This is mandated by the spec. So
> irrespective of the delay we add *after* link up, we should try to detect the
> link up for ~1s.
I think it would be sensible for dw_pcie_wait_for_link() to check for
link up more frequently, i.e., reduce LINK_WAIT_SLEEP_MS and increase
LINK_WAIT_MAX_RETRIES.
If LINK_WAIT_SLEEP_MS * LINK_WAIT_MAX_RETRIES is for the 1.0s
mentioned in sec 6.6.1, seems like maybe we should make a generic
#define for it so we could include the spec reference and use it
across all drivers. And resolve the question of 900ms vs 1000ms.
Bjorn
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-06-04 18:44 ` Bjorn Helgaas
@ 2025-06-05 12:28 ` Niklas Cassel
2025-06-05 13:22 ` Manivannan Sadhasivam
2025-06-05 19:27 ` Bjorn Helgaas
0 siblings, 2 replies; 34+ messages in thread
From: Niklas Cassel @ 2025-06-05 12:28 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Heiko Stuebner, Wilfred Mallawa, Damien Le Moal, Hans Zhang,
Laszlo Fiat, Krzysztof Wilczyński, linux-pci,
linux-arm-kernel, linux-rockchip
On Wed, Jun 04, 2025 at 01:44:45PM -0500, Bjorn Helgaas wrote:
> On Wed, Jun 04, 2025 at 10:40:09PM +0530, Manivannan Sadhasivam wrote:
>
> > > If we add a 100 ms sleep after wait_for_link(), then I suggest that we
> > > also reduce LINK_WAIT_SLEEP_MS to something shorter.
> >
> > No. The 900ms sleep is to make sure that we wait 1s before erroring out
> > assuming that the device is not present. This is mandated by the spec. So
> > irrespective of the delay we add *after* link up, we should try to detect the
> > link up for ~1s.
>
> I think it would be sensible for dw_pcie_wait_for_link() to check for
> link up more frequently, i.e., reduce LINK_WAIT_SLEEP_MS and increase
> LINK_WAIT_MAX_RETRIES.
>
> If LINK_WAIT_SLEEP_MS * LINK_WAIT_MAX_RETRIES is for the 1.0s
> mentioned in sec 6.6.1, seems like maybe we should make a generic
> #define for it so we could include the spec reference and use it
> across all drivers. And resolve the question of 900ms vs 1000ms.
Like Bjorn mentioned, when I wrote reduce LINK_WAIT_SLEEP_MS,
I simply meant that we should poll for link up more frequently.
But yes, if we reduce LINK_WAIT_SLEEP_MS we should bump
LINK_WAIT_MAX_RETRIES to not change the current max wait time.
Bjorn, should I send something out after -rc1, or did you want
to work on this yourself?
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-06-05 12:28 ` Niklas Cassel
@ 2025-06-05 13:22 ` Manivannan Sadhasivam
2025-06-05 19:27 ` Bjorn Helgaas
1 sibling, 0 replies; 34+ messages in thread
From: Manivannan Sadhasivam @ 2025-06-05 13:22 UTC (permalink / raw)
To: Niklas Cassel
Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring, Bjorn Helgaas, Heiko Stuebner, Wilfred Mallawa,
Damien Le Moal, Hans Zhang, Laszlo Fiat,
Krzysztof Wilczyński, linux-pci, linux-arm-kernel,
linux-rockchip
On Thu, Jun 05, 2025 at 02:28:41PM +0200, Niklas Cassel wrote:
> On Wed, Jun 04, 2025 at 01:44:45PM -0500, Bjorn Helgaas wrote:
> > On Wed, Jun 04, 2025 at 10:40:09PM +0530, Manivannan Sadhasivam wrote:
> >
> > > > If we add a 100 ms sleep after wait_for_link(), then I suggest that we
> > > > also reduce LINK_WAIT_SLEEP_MS to something shorter.
> > >
> > > No. The 900ms sleep is to make sure that we wait 1s before erroring out
> > > assuming that the device is not present. This is mandated by the spec. So
> > > irrespective of the delay we add *after* link up, we should try to detect the
> > > link up for ~1s.
> >
> > I think it would be sensible for dw_pcie_wait_for_link() to check for
> > link up more frequently, i.e., reduce LINK_WAIT_SLEEP_MS and increase
> > LINK_WAIT_MAX_RETRIES.
> >
> > If LINK_WAIT_SLEEP_MS * LINK_WAIT_MAX_RETRIES is for the 1.0s
> > mentioned in sec 6.6.1, seems like maybe we should make a generic
> > #define for it so we could include the spec reference and use it
> > across all drivers. And resolve the question of 900ms vs 1000ms.
>
> Like Bjorn mentioned, when I wrote reduce LINK_WAIT_SLEEP_MS,
> I simply meant that we should poll for link up more frequently.
>
Sorry, I couldn't decipher it.
> But yes, if we reduce LINK_WAIT_SLEEP_MS we should bump
> LINK_WAIT_MAX_RETRIES to not change the current max wait time.
>
I agree. And I like the idea of having a generic define which Bjorn suggested
for the wait delay.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready
2025-06-05 12:28 ` Niklas Cassel
2025-06-05 13:22 ` Manivannan Sadhasivam
@ 2025-06-05 19:27 ` Bjorn Helgaas
1 sibling, 0 replies; 34+ messages in thread
From: Bjorn Helgaas @ 2025-06-05 19:27 UTC (permalink / raw)
To: Niklas Cassel
Cc: Manivannan Sadhasivam, Lorenzo Pieralisi,
Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas,
Heiko Stuebner, Wilfred Mallawa, Damien Le Moal, Hans Zhang,
Laszlo Fiat, Krzysztof Wilczyński, linux-pci,
linux-arm-kernel, linux-rockchip
On Thu, Jun 05, 2025 at 02:28:41PM +0200, Niklas Cassel wrote:
> On Wed, Jun 04, 2025 at 01:44:45PM -0500, Bjorn Helgaas wrote:
> > On Wed, Jun 04, 2025 at 10:40:09PM +0530, Manivannan Sadhasivam wrote:
> >
> > > > If we add a 100 ms sleep after wait_for_link(), then I suggest that we
> > > > also reduce LINK_WAIT_SLEEP_MS to something shorter.
> > >
> > > No. The 900ms sleep is to make sure that we wait 1s before erroring out
> > > assuming that the device is not present. This is mandated by the spec. So
> > > irrespective of the delay we add *after* link up, we should try to detect the
> > > link up for ~1s.
> >
> > I think it would be sensible for dw_pcie_wait_for_link() to check for
> > link up more frequently, i.e., reduce LINK_WAIT_SLEEP_MS and increase
> > LINK_WAIT_MAX_RETRIES.
> >
> > If LINK_WAIT_SLEEP_MS * LINK_WAIT_MAX_RETRIES is for the 1.0s
> > mentioned in sec 6.6.1, seems like maybe we should make a generic
> > #define for it so we could include the spec reference and use it
> > across all drivers. And resolve the question of 900ms vs 1000ms.
>
> Like Bjorn mentioned, when I wrote reduce LINK_WAIT_SLEEP_MS,
> I simply meant that we should poll for link up more frequently.
>
> But yes, if we reduce LINK_WAIT_SLEEP_MS we should bump
> LINK_WAIT_MAX_RETRIES to not change the current max wait time.
>
>
> Bjorn, should I send something out after -rc1, or did you want
> to work on this yourself?
Yes, please post something after -rc1. Given the number of drivers
and the much smaller number of msleep() calls, I suspect lots of
drivers have similar problems.
Bjorn
^ permalink raw reply [flat|nested] 34+ messages in thread
end of thread, other threads:[~2025-06-05 19:30 UTC | newest]
Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-06 7:39 [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
2025-05-06 7:39 ` [PATCH v2 1/4] PCI: dw-rockchip: Do not enumerate bus before endpoint devices are ready Niklas Cassel
2025-05-06 11:32 ` Laszlo Fiat
2025-05-06 22:23 ` Wilfred Mallawa
2025-05-28 22:42 ` Bjorn Helgaas
2025-05-30 13:57 ` Niklas Cassel
2025-05-30 15:21 ` Bjorn Helgaas
2025-05-30 15:59 ` Manivannan Sadhasivam
2025-05-30 17:19 ` Bjorn Helgaas
2025-05-30 17:24 ` Niklas Cassel
2025-05-30 19:43 ` Bjorn Helgaas
2025-05-31 6:47 ` Manivannan Sadhasivam
2025-06-03 14:08 ` Niklas Cassel
2025-06-03 18:12 ` Bjorn Helgaas
2025-06-04 11:40 ` Niklas Cassel
2025-06-04 17:10 ` Manivannan Sadhasivam
2025-06-04 18:44 ` Bjorn Helgaas
2025-06-05 12:28 ` Niklas Cassel
2025-06-05 13:22 ` Manivannan Sadhasivam
2025-06-05 19:27 ` Bjorn Helgaas
2025-05-06 7:39 ` [PATCH v2 3/4] PCI: dw-rockchip: Replace PERST sleep time with proper macro Niklas Cassel
2025-05-06 9:07 ` Hans Zhang
2025-05-06 11:36 ` Laszlo Fiat
2025-05-06 22:24 ` Wilfred Mallawa
2025-05-06 14:33 ` [PATCH v2 0/4] PCI: dwc: Link Up IRQ fixes Niklas Cassel
2025-05-06 14:51 ` Laszlo Fiat
2025-05-13 10:53 ` Manivannan Sadhasivam
2025-05-13 14:07 ` Niklas Cassel
2025-05-15 17:33 ` Laszlo Fiat
2025-05-16 10:00 ` Niklas Cassel
2025-05-16 18:48 ` Laszlo Fiat
2025-05-19 9:44 ` Niklas Cassel
2025-05-19 12:10 ` Manivannan Sadhasivam
2025-05-19 12:37 ` Manivannan Sadhasivam
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).