From: Mika Westerberg <mika.westerberg@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
oohall@gmail.com, Chris Chiu <chris.chiu@canonical.com>,
Sathyanarayanan Kuppuswamy
<sathyanarayanan.kuppuswamy@linux.intel.com>,
Ashok Raj <ashok.raj@intel.com>,
Sheng Bi <windy.bi.enflame@gmail.com>,
Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com>,
Stanislav Spassov <stanspas@amazon.de>,
Yang Su <yang.su@linux.alibaba.com>,
shuo.tan@linux.alibaba.com, linux-pci@vger.kernel.org
Subject: Re: [PATCH v3] PCI/PM: Bail out early in pci_bridge_wait_for_secondary_bus() if link is not trained
Date: Fri, 14 Apr 2023 13:11:47 +0300
Message-ID: <20230414101147.GA66750@black.fi.intel.com>
In-Reply-To: <20230414074238.GA22973@wunner.de>

Hi,

On Fri, Apr 14, 2023 at 09:42:38AM +0200, Lukas Wunner wrote:
> On Thu, Apr 13, 2023 at 01:16:42PM +0300, Mika Westerberg wrote:
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -5037,6 +5037,22 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
> >  		}
> >  	}
> >
> > +	/*
> > +	 * Everything above is handling the delays mandated by the PCIe r6.0
> > +	 * sec 6.6.1.
> > +	 *
> > +	 * If the port supports active link reporting we now check one more
> > +	 * time if the link is active and if not bail out early with the
> > +	 * assumption that the device is not present anymore.
> > +	 */
> > +	if (dev->link_active_reporting) {
> > +		u16 status;
> > +
> > +		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
> > +		if (!(status & PCI_EXP_LNKSTA_DLLLA))
> > +			return -ENOTTY;
> > +	}
> > +
> >  	return pci_dev_wait(child, reset_type,
> >  			    PCIE_RESET_READY_POLL_MS - delay);
> >  }
>
> Hm, shouldn't the added code live in the
>
> if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT)
>
> branch? For the else branch (Gen3+ devices with > 5 GT/s),
> we've already waited for the link to become active, so the
> additional check seems superfluous. (But maybe I'm missing
> something.)

You are not missing anything ;-) Indeed it belongs there. I think we are
also now missing the "optimization" for devices behind a slow link without
active link reporting capability: for those we should wait only 1s instead
of the whole 60s.

> I also note that this documentation change has been dropped
> vis-à-vis v1 of the patch, not sure if that's intentional:
>
> -	 * However, 100 ms is the minimum and the PCIe spec says the
> -	 * software must allow at least 1s before it can determine that the
> -	 * device that did not respond is a broken device. There is
> -	 * evidence that 100 ms is not always enough, for example certain
> -	 * Titan Ridge xHCI controller does not always respond to
> -	 * configuration requests if we only wait for 100 ms (see
> -	 * https://bugzilla.kernel.org/show_bug.cgi?id=203885).
> +	 * However, 100 ms is the minimum and the PCIe spec says the software
> +	 * must allow at least 1s before it can determine that the device that
> +	 * did not respond is a broken device. Also device can take longer than
> +	 * that to respond if it indicates so through Request Retry Status
> +	 * completions.

This is not intentional. I will add it back in the next version.

To summarize, the v4 patch would look something like the one below. It is
only compile tested, but I will run real testing later today. I think it now
includes the 1s optimization and also checks for active link reporting
support on ports with devices behind slow links. Let me know if I missed
something.

It is getting rather complex unfortunately :(

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 61bf8a4b2099..f81a9e6aff84 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -64,6 +64,13 @@ struct pci_pme_device {

 #define PME_TIMEOUT 1000 /* How long between PME checks */

+/*
+ * Following exit from Conventional Reset, devices must be ready within 1 sec
+ * (PCIe r6.0 sec 6.6.1). A D3cold to D0 transition implies a Conventional
+ * Reset (PCIe r6.0 sec 5.8).
+ */
+#define PCI_RESET_WAIT 1000 /* msec */
+
 /*
  * Devices may extend the 1 sec period through Request Retry Status
  * completions (PCIe r6.0 sec 2.3.1). The spec does not provide an upper
@@ -5010,13 +5017,11 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
 	 * speeds (gen3) we need to wait first for the data link layer to
 	 * become active.
 	 *
-	 * However, 100 ms is the minimum and the PCIe spec says the
-	 * software must allow at least 1s before it can determine that the
-	 * device that did not respond is a broken device. There is
-	 * evidence that 100 ms is not always enough, for example certain
-	 * Titan Ridge xHCI controller does not always respond to
-	 * configuration requests if we only wait for 100 ms (see
-	 * https://bugzilla.kernel.org/show_bug.cgi?id=203885).
+	 * However, 100 ms is the minimum and the PCIe spec says the software
+	 * must allow at least 1s before it can determine that the device that
+	 * did not respond is a broken device. Also device can take longer than
+	 * that to respond if it indicates so through Request Retry Status
+	 * completions.
 	 *
 	 * Therefore we wait for 100 ms and check for the device presence
 	 * until the timeout expires.
@@ -5027,30 +5032,29 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
 	if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
 		pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
 		msleep(delay);
-	} else {
-		pci_dbg(dev, "waiting %d ms for downstream link, after activation\n",
-			delay);
-		if (!pcie_wait_for_link_delay(dev, true, delay)) {
-			/* Did not train, no need to wait any further */
-			pci_info(dev, "Data Link Layer Link Active not set in 1000 msec\n");
-			return -ENOTTY;
+
+		/*
+		 * If the port supports active link reporting we now check one
+		 * more time if the link is active and if not bail out early
+		 * with the assumption that the device is not present anymore.
+		 */
+		if (dev->link_active_reporting) {
+			u16 status;
+
+			pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
+			if (!(status & PCI_EXP_LNKSTA_DLLLA))
+				return -ENOTTY;
 		}
-	}

-	/*
-	 * Everything above is handling the delays mandated by the PCIe r6.0
-	 * sec 6.6.1.
-	 *
-	 * If the port supports active link reporting we now check one more
-	 * time if the link is active and if not bail out early with the
-	 * assumption that the device is not present anymore.
-	 */
-	if (dev->link_active_reporting) {
-		u16 status;
+		return pci_dev_wait(child, reset_type, PCI_RESET_WAIT - delay);
+	}

-		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
-		if (!(status & PCI_EXP_LNKSTA_DLLLA))
-			return -ENOTTY;
+	pci_dbg(dev, "waiting %d ms for downstream link, after activation\n",
+		delay);
+	if (!pcie_wait_for_link_delay(dev, true, delay)) {
+		/* Did not train, no need to wait any further */
+		pci_info(dev, "Data Link Layer Link Active not set in 1000 msec\n");
+		return -ENOTTY;
 	}

 	return pci_dev_wait(child, reset_type,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 022da58afb33..f2d3aeab91f4 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -64,13 +64,6 @@ struct pci_cap_saved_state *pci_find_saved_ext_cap(struct pci_dev *dev,
 #define PCI_PM_D3HOT_WAIT 10 /* msec */
 #define PCI_PM_D3COLD_WAIT 100 /* msec */

-/*
- * Following exit from Conventional Reset, devices must be ready within 1 sec
- * (PCIe r6.0 sec 6.6.1). A D3cold to D0 transition implies a Conventional
- * Reset (PCIe r6.0 sec 5.8).
- */
-#define PCI_RESET_WAIT 1000 /* msec */
-
 void pci_update_current_state(struct pci_dev *dev, pci_power_t state);
 void pci_refresh_power_state(struct pci_dev *dev);
 int pci_power_up(struct pci_dev *dev);
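
Since the branching is getting complex, here is a rough user-space model of
the control flow the draft above seems to implement. It is only a sketch and
not kernel code: struct port_model and wait_budget_ms() are hypothetical
names made up for illustration, the 60000 ms value for
PCIE_RESET_READY_POLL_MS is assumed from the "whole 60s" mentioned earlier,
and the real function sleeps and polls the hardware instead of taking the
link state as an input.

```c
#include <errno.h>
#include <stdbool.h>

#define PCI_RESET_WAIT           1000   /* msec, as defined in the diff */
#define PCIE_RESET_READY_POLL_MS 60000  /* msec, assumed ("the whole 60s") */

/* Hypothetical stand-in for the port state the kernel reads from hardware. */
struct port_model {
	bool slow_link;             /* speed capability <= 5 GT/s */
	bool link_active_reporting; /* port can report Data Link Layer Link Active */
	bool link_active;           /* link state observed after the initial delay */
};

/*
 * Returns the time budget in msec that would be handed to the device
 * readiness poll (pci_dev_wait() in the patch), or -ENOTTY when the
 * flow bails out early because the link did not come up.
 */
static int wait_budget_ms(const struct port_model *p, int delay_ms)
{
	if (p->slow_link) {
		/* Slow link: fixed delay, then an optional link check. */
		if (p->link_active_reporting && !p->link_active)
			return -ENOTTY;
		/* Only the 1 s window applies here, not the 60 s one. */
		return PCI_RESET_WAIT - delay_ms;
	}

	/* Fast link: the link must become active before polling the device. */
	if (!p->link_active)
		return -ENOTTY;
	return PCIE_RESET_READY_POLL_MS - delay_ms;
}
```

With a 100 ms delay, a slow-link port without active link reporting gets a
900 ms budget in this model, while a fast-link port whose link trained gets
59900 ms.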