Linux PCI subsystem development
From: Niklas Cassel <cassel@kernel.org>
To: Manivannan Sadhasivam <mani@kernel.org>
Cc: "Bjorn Helgaas" <helgaas@kernel.org>,
	"Jingoo Han" <jingoohan1@gmail.com>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
	"Rob Herring" <robh@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Wilfred Mallawa" <wilfred.mallawa@wdc.com>,
	"Damien Le Moal" <dlemoal@kernel.org>,
	"Laszlo Fiat" <laszlo.fiat@proton.me>,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH v4 5/7] PCI: dwc: Ensure that dw_pcie_wait_for_link() waits 100 ms after link up
Date: Wed, 2 Jul 2025 18:17:41 +0200	[thread overview]
Message-ID: <aGVbpTZZmYyKIffk@ryzen> (raw)
In-Reply-To: <hhyxhxvohmeqzjdu3amabcpf3e4ufi4ps5xd2uia4ea6i2u5oj@sxyjavqyqc7m>

Hello Mani,

On Wed, Jul 02, 2025 at 08:17:44PM +0530, Manivannan Sadhasivam wrote:
> On Wed, Jul 02, 2025 at 11:43:11AM GMT, Niklas Cassel wrote:
> > However, my main concern is not that qcom waits twice, it is those drivers
> > that do not have a 100 ms delay after PERST# has been deasserted, because
> > before commit 470f10f18b48 ("PCI: Reduce PCIE_LINK_WAIT_SLEEP_MS"), those
> > drivers might have been "saved" by the ridiculously long
> > PCIE_LINK_WAIT_SLEEP_MS.
> > 
> > However, now when we sleep less in each iteration when polling for link up,
> > those drivers that do not have a 100 ms delay after PERST# has been
> > deasserted might actually see regressions, because (the now reduced)
> > PCIE_LINK_WAIT_SLEEP_MS time is no longer "saving" them.
> 
> Why can't you just add the delay to those drivers now? They can be consolidated
> later.

Right now, I don't have the extra cycles needed to read all of these
drivers:

$ git grep -l gpio drivers/pci/controller/dwc/
drivers/pci/controller/dwc/pci-dra7xx.c
drivers/pci/controller/dwc/pci-imx6.c
drivers/pci/controller/dwc/pci-keystone.c
drivers/pci/controller/dwc/pci-meson.c
drivers/pci/controller/dwc/pcie-amd-mdb.c
drivers/pci/controller/dwc/pcie-bt1.c
drivers/pci/controller/dwc/pcie-designware-plat.c
drivers/pci/controller/dwc/pcie-designware.c
drivers/pci/controller/dwc/pcie-designware.h
drivers/pci/controller/dwc/pcie-dw-rockchip.c
drivers/pci/controller/dwc/pcie-fu740.c
drivers/pci/controller/dwc/pcie-histb.c
drivers/pci/controller/dwc/pcie-intel-gw.c
drivers/pci/controller/dwc/pcie-keembay.c
drivers/pci/controller/dwc/pcie-kirin.c
drivers/pci/controller/dwc/pcie-qcom-ep.c
drivers/pci/controller/dwc/pcie-qcom.c
drivers/pci/controller/dwc/pcie-rcar-gen4.c
drivers/pci/controller/dwc/pcie-tegra194.c
drivers/pci/controller/dwc/pcie-visconti.c

then understand them properly so that I am confident in my changes, wait
for reviews from each glue driver maintainer, and then wait for someone
to pick the patches up.

Also, all of the above has to be done when consolidating the PERST#
handling anyway.



This whole thing started because someone reported a regression with a random
Plextor NVMe drive. Since my commit ec9fd499b9c6 ("PCI: dw-rockchip:
Don't wait for link since we can detect Link Up") introduced the
regression, I obviously helped debug the issue.

The regression was solved by adding a 100 ms delay after receiving the link
up IRQ, before enumerating the bus.

This fix was sent May 5th:
https://lore.kernel.org/linux-pci/20250505092603.286623-7-cassel@kernel.org/
(That series went up to v2; it was then renamed and went up to v4, so
basically six revisions in total.)

While my part was done two months ago, I still wanted to make sure that
no one else got bitten by the same bug, so I also improved the generic
dw_pcie_wait_for_link() (for those drivers not using a link up IRQ):
https://lore.kernel.org/linux-pci/20250625102347.1205584-14-cassel@kernel.org/

Sure, that will only help PCIe controllers that support Link speeds greater
than 5.0 GT/s, and do not use a link up IRQ, but still something.


PCIe controllers that only support Link speeds <= 5.0 GT/s, and do not use
a link up IRQ, and do not have a delay after deasserting PERST#, can still
send config requests too early.

However, if we drop 470f10f18b48 ("PCI: Reduce PCIE_LINK_WAIT_SLEEP_MS"),
PCIe controllers that only support Link speeds <= 5.0 GT/s, and do not use
a link up IRQ, and do not have a delay after deasserting PERST#, will be no
worse off than what is already in mainline.

PCIe controllers that support Link speeds greater than 5.0 GT/s, and do not
use a link up IRQ, will still be more robust than ever :)
(Rome wasn't built in a day.)


Kind regards,
Niklas
