From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
oohall@gmail.com, Chris Chiu <chris.chiu@canonical.com>,
Sathyanarayanan Kuppuswamy
<sathyanarayanan.kuppuswamy@linux.intel.com>,
Ashok Raj <ashok.raj@intel.com>,
Sheng Bi <windy.bi.enflame@gmail.com>,
Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>,
Stanislav Spassov <stanspas@amazon.de>,
Yang Su <yang.su@linux.alibaba.com>,
shuo.tan@linux.alibaba.com, linux-pci@vger.kernel.org
Subject: Re: [PATCH v3] PCI/PM: Bail out early in pci_bridge_wait_for_secondary_bus() if link is not trained
Date: Fri, 14 Apr 2023 13:11:47 +0300
Message-ID: <20230414101147.GA66750@black.fi.intel.com>
In-Reply-To: <20230414074238.GA22973@wunner.de>
Hi,
On Fri, Apr 14, 2023 at 09:42:38AM +0200, Lukas Wunner wrote:
> On Thu, Apr 13, 2023 at 01:16:42PM +0300, Mika Westerberg wrote:
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5037,6 +5037,22 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
> > }
> > }
> >
> > + /*
> > + * Everything above is handling the delays mandated by the PCIe r6.0
> > + * sec 6.6.1.
> > + *
> > + * If the port supports active link reporting we now check one more
> > + * time if the link is active and if not bail out early with the
> > + * assumption that the device is not present anymore.
> > + */
> > + if (dev->link_active_reporting) {
> > + u16 status;
> > +
> > + pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
> > + if (!(status & PCI_EXP_LNKSTA_DLLLA))
> > + return -ENOTTY;
> > + }
> > +
> > return pci_dev_wait(child, reset_type,
> > PCIE_RESET_READY_POLL_MS - delay);
> > }
>
> Hm, shouldn't the added code live in the
>
> if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT)
>
> branch? For the else branch (Gen3+ devices with > 5 GT/s),
> we've already waited for the link to become active, so the
> additional check seems superfluous. (But maybe I'm missing
> something.)
You are not missing anything ;-) Indeed it belongs there, and I think
we are now also missing the "optimization" for devices behind a slow
link without active link reporting capability. That is, for those we
should wait only 1s instead of the whole 60s.
> I also note that this documentation change has been dropped
> vis-à-vis v1 of the patch, not sure if that's intentional:
>
> - * However, 100 ms is the minimum and the PCIe spec says the
> - * software must allow at least 1s before it can determine that the
> - * device that did not respond is a broken device. There is
> - * evidence that 100 ms is not always enough, for example certain
> - * Titan Ridge xHCI controller does not always respond to
> - * configuration requests if we only wait for 100 ms (see
> - * https://bugzilla.kernel.org/show_bug.cgi?id=203885).
> + * However, 100 ms is the minimum and the PCIe spec says the software
> + * must allow at least 1s before it can determine that the device that
> + * did not respond is a broken device. Also device can take longer than
> + * that to respond if it indicates so through Request Retry Status
> + * completions.
This is not intentional. I will add it back in the next version.
To summarize, the v4 patch would look something like the one below. It
is only compile tested, but I will run real testing later today. I think
it now includes the 1s optimization and also the active link reporting
check for devices behind slow links. Let me know if I missed something.
It is getting rather complex unfortunately :(
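As a rough illustration only (not part of the patch), the decision flow
described above can be modelled in standalone C outside the kernel. The
struct port_model and the function secondary_bus_poll_budget() are
hypothetical names made up for this sketch. PCI_RESET_WAIT is taken from
the diff, and PCIE_RESET_READY_POLL_MS is assumed to be 60000 ms to match
the 60s mentioned above. A return value of -1 stands in for -ENOTTY.

```c
#include <stdbool.h>

#define PCI_RESET_WAIT			1000	/* msec, as in the diff */
#define PCIE_RESET_READY_POLL_MS	60000	/* msec, assumed (the "60s") */

/* Hypothetical model of the bridge port, not a kernel structure. */
struct port_model {
	bool slow_link;			/* speed cap <= 5 GT/s */
	bool link_active_reporting;	/* port can report DLLLA */
	bool dllla;			/* link active once the delay is over */
};

/*
 * Return how many msec may still be spent polling the child device,
 * or -1 if the wait should bail out early (the patch returns -ENOTTY).
 */
int secondary_bus_poll_budget(const struct port_model *p, int delay)
{
	if (p->slow_link) {
		/* Slow link: bail out only if the port can report link state. */
		if (p->link_active_reporting && !p->dllla)
			return -1;
		/* Otherwise poll for the rest of the 1s window. */
		return PCI_RESET_WAIT - delay;
	}

	/* Fast link: the link must have trained before we poll at all. */
	if (!p->dllla)
		return -1;

	return PCIE_RESET_READY_POLL_MS - delay;
}
```

With delay = 100, a slow port without active link reporting gets 900 ms
of polling, a slow port that reports the link as down bails out at once,
and a fast port whose link trained gets 59900 ms.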
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 61bf8a4b2099..f81a9e6aff84 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -64,6 +64,13 @@ struct pci_pme_device {
#define PME_TIMEOUT 1000 /* How long between PME checks */
+/*
+ * Following exit from Conventional Reset, devices must be ready within 1 sec
+ * (PCIe r6.0 sec 6.6.1). A D3cold to D0 transition implies a Conventional
+ * Reset (PCIe r6.0 sec 5.8).
+ */
+#define PCI_RESET_WAIT 1000 /* msec */
+
/*
* Devices may extend the 1 sec period through Request Retry Status
* completions (PCIe r6.0 sec 2.3.1). The spec does not provide an upper
@@ -5010,13 +5017,11 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
* speeds (gen3) we need to wait first for the data link layer to
* become active.
*
- * However, 100 ms is the minimum and the PCIe spec says the
- * software must allow at least 1s before it can determine that the
- * device that did not respond is a broken device. There is
- * evidence that 100 ms is not always enough, for example certain
- * Titan Ridge xHCI controller does not always respond to
- * configuration requests if we only wait for 100 ms (see
- * https://bugzilla.kernel.org/show_bug.cgi?id=203885).
+ * However, 100 ms is the minimum and the PCIe spec says the software
+ * must allow at least 1s before it can determine that the device that
+ * did not respond is a broken device. Also device can take longer than
+ * that to respond if it indicates so through Request Retry Status
+ * completions.
*
* Therefore we wait for 100 ms and check for the device presence
* until the timeout expires.
@@ -5027,30 +5032,29 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
msleep(delay);
- } else {
- pci_dbg(dev, "waiting %d ms for downstream link, after activation\n",
- delay);
- if (!pcie_wait_for_link_delay(dev, true, delay)) {
- /* Did not train, no need to wait any further */
- pci_info(dev, "Data Link Layer Link Active not set in 1000 msec\n");
- return -ENOTTY;
+
+ /*
+ * If the port supports active link reporting we now check one
+ * more time if the link is active and if not bail out early
+ * with the assumption that the device is not present anymore.
+ */
+ if (dev->link_active_reporting) {
+ u16 status;
+
+ pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
+ if (!(status & PCI_EXP_LNKSTA_DLLLA))
+ return -ENOTTY;
}
- }
- /*
- * Everything above is handling the delays mandated by the PCIe r6.0
- * sec 6.6.1.
- *
- * If the port supports active link reporting we now check one more
- * time if the link is active and if not bail out early with the
- * assumption that the device is not present anymore.
- */
- if (dev->link_active_reporting) {
- u16 status;
+ return pci_dev_wait(child, reset_type, PCI_RESET_WAIT - delay);
+ }
- pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
- if (!(status & PCI_EXP_LNKSTA_DLLLA))
- return -ENOTTY;
+ pci_dbg(dev, "waiting %d ms for downstream link, after activation\n",
+ delay);
+ if (!pcie_wait_for_link_delay(dev, true, delay)) {
+ /* Did not train, no need to wait any further */
+ pci_info(dev, "Data Link Layer Link Active not set in 1000 msec\n");
+ return -ENOTTY;
}
return pci_dev_wait(child, reset_type,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 022da58afb33..f2d3aeab91f4 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -64,13 +64,6 @@ struct pci_cap_saved_state *pci_find_saved_ext_cap(struct pci_dev *dev,
#define PCI_PM_D3HOT_WAIT 10 /* msec */
#define PCI_PM_D3COLD_WAIT 100 /* msec */
-/*
- * Following exit from Conventional Reset, devices must be ready within 1 sec
- * (PCIe r6.0 sec 6.6.1). A D3cold to D0 transition implies a Conventional
- * Reset (PCIe r6.0 sec 5.8).
- */
-#define PCI_RESET_WAIT 1000 /* msec */
-
void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
void pci_refresh_power_state(struct pci_dev *dev);
int pci_power_up(struct pci_dev *dev);