public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready
@ 2024-08-27 23:48 Bjorn Helgaas
  2024-08-27 23:48 ` [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS Bjorn Helgaas
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2024-08-27 23:48 UTC (permalink / raw)
  To: linux-pci
  Cc: Ilpo Järvinen, Maciej W . Rozycki, Mika Westerberg,
	Lukas Wunner, Rafael J . Wysocki, Mario Limonciello, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas

From: Bjorn Helgaas <bhelgaas@google.com>

After a device reset, pci_dev_wait() waits for a device to become
completely ready by polling the PCI_COMMAND register.  The spec envisions
that software would instead poll for the device to stop responding to
config reads with Completions with Request Retry Status (RRS).

Polling PCI_COMMAND leads to hardware retries that are invisible to
software and the backoff between software retries doesn't work correctly.

Root Ports are not required to support the Configuration RRS Software
Visibility feature that prevents hardware retries and makes the RRS
Completions visible to software, so this series only uses it when available
and falls back to PCI_COMMAND polling when it's not.

This is completely untested and posted for comments.

Bjorn Helgaas (3):
  PCI: Wait for device readiness with Configuration RRS
  PCI: aardvark: Correct Configuration RRS checking
  PCI: Rename CRS Completion Status to RRS

 drivers/bcma/driver_pci_host.c             | 10 ++--
 drivers/pci/controller/dwc/pcie-tegra194.c | 18 +++---
 drivers/pci/controller/pci-aardvark.c      | 64 +++++++++++-----------
 drivers/pci/controller/pci-xgene.c         |  6 +-
 drivers/pci/controller/pcie-iproc.c        | 18 +++---
 drivers/pci/pci-bridge-emul.c              |  4 +-
 drivers/pci/pci.c                          | 41 +++++++++-----
 drivers/pci/pci.h                          | 11 +++-
 drivers/pci/probe.c                        | 33 +++++------
 include/linux/bcma/bcma_driver_pci.h       |  2 +-
 include/linux/pci.h                        |  1 +
 include/uapi/linux/pci_regs.h              |  6 +-
 12 files changed, 117 insertions(+), 97 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS
  2024-08-27 23:48 [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Bjorn Helgaas
@ 2024-08-27 23:48 ` Bjorn Helgaas
  2024-08-28  4:17   ` Lukas Wunner
  2024-09-11  0:42   ` Duc Dang
  2024-08-27 23:48 ` [RFC PATCH 2/3] PCI: aardvark: Correct Configuration RRS checking Bjorn Helgaas
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2024-08-27 23:48 UTC (permalink / raw)
  To: linux-pci
  Cc: Ilpo Järvinen, Maciej W . Rozycki, Mika Westerberg,
	Lukas Wunner, Rafael J . Wysocki, Mario Limonciello, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas

From: Bjorn Helgaas <bhelgaas@google.com>

After a device reset, delays are required before the device can
successfully complete config accesses.  PCIe r6.0, sec 6.6, specifies some
delays required before software can perform config accesses.  Devices that
require more time after those delays may respond to config accesses with
Configuration Request Retry Status (RRS) completions.

Callers of pci_dev_wait() are responsible for delays until the device can
respond to config accesses.  pci_dev_wait() waits any additional time until
the device can successfully complete config accesses.

Reading config space of devices that are not present or not ready typically
returns ~0 (PCI_ERROR_RESPONSE).  Previously we polled the Command register
until we got a value other than ~0.  This is sometimes a problem because
Root Complex handling of RRS completions may include several retries and
implementation-specific behavior that is invisible to software (see sec
2.3.2), so the exponential backoff in pci_dev_wait() may not work as
intended.

Linux enables Configuration RRS Software Visibility on all Root Ports that
support it.  If it is enabled, read the Vendor ID instead of the Command
register.  RRS completions cause immediate return of the 0x0001 reserved
Vendor ID value, so the pci_dev_wait() backoff works correctly.

When a read of Vendor ID eventually completes successfully by returning a
non-0x0001 value (the Vendor ID or 0xffff for VFs), the device should be
initialized and ready to respond to config requests.

For conventional PCI devices or devices below Root Ports that don't support
Configuration RRS Software Visibility, poll the Command register as before.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/pci.c   | 41 ++++++++++++++++++++++++++++-------------
 drivers/pci/pci.h   |  5 +++++
 drivers/pci/probe.c |  9 +++------
 include/linux/pci.h |  1 +
 4 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e3a49f66982d..fc2ecb7fe081 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1283,7 +1283,9 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
 {
 	int delay = 1;
 	bool retrain = false;
-	struct pci_dev *bridge;
+	struct pci_dev *root, *bridge;
+
+	root = pcie_find_root_port(dev);
 
 	if (pci_is_pcie(dev)) {
 		bridge = pci_upstream_bridge(dev);
@@ -1292,16 +1294,23 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
 	}
 
 	/*
-	 * After reset, the device should not silently discard config
-	 * requests, but it may still indicate that it needs more time by
-	 * responding to them with CRS completions.  The Root Port will
-	 * generally synthesize ~0 (PCI_ERROR_RESPONSE) data to complete
-	 * the read (except when CRS SV is enabled and the read was for the
-	 * Vendor ID; in that case it synthesizes 0x0001 data).
+	 * The caller has already waited long enough after a reset that the
+	 * device should respond to config requests, but it may respond
+	 * with Request Retry Status (RRS) if it needs more time to
+	 * initialize.
 	 *
-	 * Wait for the device to return a non-CRS completion.  Read the
-	 * Command register instead of Vendor ID so we don't have to
-	 * contend with the CRS SV value.
+	 * If the device is below a Root Port with Configuration RRS
+	 * Software Visibility enabled, reading the Vendor ID returns a
+	 * special data value if the device responded with RRS.  Read the
+	 * Vendor ID until we get non-RRS status.
+	 *
+	 * If there's no Root Port or Configuration RRS Software Visibility
+	 * is not enabled, the device may still respond with RRS, but
+	 * hardware may retry the config request.  If no retries receive
+	 * Successful Completion, hardware generally synthesizes ~0
+	 * (PCI_ERROR_RESPONSE) data to complete the read.  Reading Vendor
+	 * ID for VFs and non-existent devices also returns ~0, so read the
+	 * Command register until it returns something other than ~0.
 	 */
 	for (;;) {
 		u32 id;
@@ -1311,9 +1320,15 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
 			return -ENOTTY;
 		}
 
-		pci_read_config_dword(dev, PCI_COMMAND, &id);
-		if (!PCI_POSSIBLE_ERROR(id))
-			break;
+		if (root && root->config_crs_sv) {
+			pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
+			if (!pci_bus_crs_vendor_id(id))
+				break;
+		} else {
+			pci_read_config_dword(dev, PCI_COMMAND, &id);
+			if (!PCI_POSSIBLE_ERROR(id))
+				break;
+		}
 
 		if (delay > timeout) {
 			pci_warn(dev, "not ready %dms after %s; giving up\n",
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 79c8398f3938..fa1997bc2667 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -139,6 +139,11 @@ bool pci_bridge_d3_possible(struct pci_dev *dev);
 void pci_bridge_d3_update(struct pci_dev *dev);
 int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type);
 
+static inline bool pci_bus_crs_vendor_id(u32 l)
+{
+	return (l & 0xffff) == PCI_VENDOR_ID_PCI_SIG;
+}
+
 static inline void pci_wakeup_event(struct pci_dev *dev)
 {
 	/* Wait 100 ms before the system can be put into a sleep state. */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index b14b9876c030..b1615da9eb6b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1209,9 +1209,11 @@ static void pci_enable_crs(struct pci_dev *pdev)
 
 	/* Enable CRS Software Visibility if supported */
 	pcie_capability_read_word(pdev, PCI_EXP_RTCAP, &root_cap);
-	if (root_cap & PCI_EXP_RTCAP_CRSVIS)
+	if (root_cap & PCI_EXP_RTCAP_CRSVIS) {
 		pcie_capability_set_word(pdev, PCI_EXP_RTCTL,
 					 PCI_EXP_RTCTL_CRSSVE);
+		pdev->config_crs_sv = 1;
+	}
 }
 
 static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
@@ -2343,11 +2345,6 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_alloc_dev);
 
-static bool pci_bus_crs_vendor_id(u32 l)
-{
-	return (l & 0xffff) == PCI_VENDOR_ID_PCI_SIG;
-}
-
 static bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l,
 			     int timeout)
 {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 4cf89a4b4cbc..121d8d94d6d0 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -371,6 +371,7 @@ struct pci_dev {
 					   can be generated */
 	unsigned int	pme_poll:1;	/* Poll device's PME status bit */
 	unsigned int	pinned:1;	/* Whether this dev is pinned */
+	unsigned int	config_crs_sv:1; /* Config CRS software visibility */
 	unsigned int	imm_ready:1;	/* Supports Immediate Readiness */
 	unsigned int	d1_support:1;	/* Low power state D1 is supported */
 	unsigned int	d2_support:1;	/* Low power state D2 is supported */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH 2/3] PCI: aardvark: Correct Configuration RRS checking
  2024-08-27 23:48 [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Bjorn Helgaas
  2024-08-27 23:48 ` [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS Bjorn Helgaas
@ 2024-08-27 23:48 ` Bjorn Helgaas
  2024-08-27 23:48 ` [RFC PATCH 3/3] PCI: Rename CRS Completion Status to RRS Bjorn Helgaas
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2024-08-27 23:48 UTC (permalink / raw)
  To: linux-pci
  Cc: Ilpo Järvinen, Maciej W . Rozycki, Mika Westerberg,
	Lukas Wunner, Rafael J . Wysocki, Mario Limonciello, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas

From: Bjorn Helgaas <bhelgaas@google.com>

Per PCIe r6.0, sec 2.3.2, when a Root Complex handles a Completion with
Request Retry Status for a Configuration Read Request that includes both
bytes of the Vendor ID field, it must complete the Request to the host by
returning 0001h for the Vendor ID and all 1's for any additional bytes.

Previously we only returned the 0001h Vendor ID value if we got an RRS
completion for reads of exactly 4 bytes.  A read of 2 bytes would not
qualify, although the spec says it should.

Check for reads of 2 or more bytes including the Vendor ID.

I don't think this will fix any observable problems because RRS only
applies to the first config reads after reset, and those are all currently
dword (4-byte) reads.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/controller/pci-aardvark.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index 8b3e1a079cf3..e66594558ce2 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -1153,11 +1153,11 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
 						 size, val);
 
 	/*
-	 * Completion Retry Status is possible to return only when reading all
-	 * 4 bytes from PCI_VENDOR_ID and PCI_DEVICE_ID registers at once and
-	 * CRSSVE flag on Root Bridge is enabled.
+	 * Completion Retry Status is possible to return only when reading
+	 * both bytes from PCI_VENDOR_ID at once and CRSSVE flag on Root
+	 * Port is enabled.
 	 */
-	allow_crs = (where == PCI_VENDOR_ID) && (size == 4) &&
+	allow_crs = (where == PCI_VENDOR_ID) && (size >= 2) &&
 		    (le16_to_cpu(pcie->bridge.pcie_conf.rootctl) &
 		     PCI_EXP_RTCTL_CRSSVE);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH 3/3] PCI: Rename CRS Completion Status to RRS
  2024-08-27 23:48 [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Bjorn Helgaas
  2024-08-27 23:48 ` [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS Bjorn Helgaas
  2024-08-27 23:48 ` [RFC PATCH 2/3] PCI: aardvark: Correct Configuration RRS checking Bjorn Helgaas
@ 2024-08-27 23:48 ` Bjorn Helgaas
  2024-08-28 21:24 ` [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Mario Limonciello
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2024-08-27 23:48 UTC (permalink / raw)
  To: linux-pci
  Cc: Ilpo Järvinen, Maciej W . Rozycki, Mika Westerberg,
	Lukas Wunner, Rafael J . Wysocki, Mario Limonciello, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas

From: Bjorn Helgaas <bhelgaas@google.com>

PCIe r6.0 changed the abbreviation for "Configuration Request Retry Status"
Completion Status from "CRS" to "RRS" and uses the terminology of
"Configuration RRS Software Visibility" instead of "CRS Software
Visibility".

Align the Linux usage with the r6.0 spec language.  No functional change
intended.

It's confusing to make this change, but I think "RRS" *is* a better
abbreviation because it was easy to interpret "CRS" as "Completion Retry
Status", which really didn't make any sense.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/bcma/driver_pci_host.c             | 10 ++--
 drivers/pci/controller/dwc/pcie-tegra194.c | 18 +++---
 drivers/pci/controller/pci-aardvark.c      | 64 +++++++++++-----------
 drivers/pci/controller/pci-xgene.c         |  6 +-
 drivers/pci/controller/pcie-iproc.c        | 18 +++---
 drivers/pci/pci-bridge-emul.c              |  4 +-
 drivers/pci/pci.c                          |  4 +-
 drivers/pci/pci.h                          |  8 +--
 drivers/pci/probe.c                        | 28 +++++-----
 include/linux/bcma/bcma_driver_pci.h       |  2 +-
 include/linux/pci.h                        |  2 +-
 include/uapi/linux/pci_regs.h              |  6 +-
 12 files changed, 86 insertions(+), 84 deletions(-)

diff --git a/drivers/bcma/driver_pci_host.c b/drivers/bcma/driver_pci_host.c
index ed3be52ab63d..8540052d37c5 100644
--- a/drivers/bcma/driver_pci_host.c
+++ b/drivers/bcma/driver_pci_host.c
@@ -334,7 +334,7 @@ static u8 bcma_find_pci_capability(struct bcma_drv_pci *pc, unsigned int dev,
 }
 
 /* If the root port is capable of returning Config Request
- * Retry Status (CRS) Completion Status to software then
+ * Retry Status (RRS) Completion Status to software then
  * enable the feature.
  */
 static void bcma_core_pci_enable_crs(struct bcma_drv_pci *pc)
@@ -348,10 +348,10 @@ static void bcma_core_pci_enable_crs(struct bcma_drv_pci *pc)
 					   NULL);
 	root_cap = cap_ptr + PCI_EXP_RTCAP;
 	bcma_extpci_read_config(pc, 0, 0, root_cap, &val16, sizeof(u16));
-	if (val16 & BCMA_CORE_PCI_RC_CRS_VISIBILITY) {
-		/* Enable CRS software visibility */
+	if (val16 & BCMA_CORE_PCI_RC_RRS_VISIBILITY) {
+		/* Enable Configuration RRS Software Visibility */
 		root_ctrl = cap_ptr + PCI_EXP_RTCTL;
-		val16 = PCI_EXP_RTCTL_CRSSVE;
+		val16 = PCI_EXP_RTCTL_RRS_SVE;
 		bcma_extpci_read_config(pc, 0, 0, root_ctrl, &val16,
 					sizeof(u16));
 
@@ -360,7 +360,7 @@ static void bcma_core_pci_enable_crs(struct bcma_drv_pci *pc)
 		 * 100 ms wait time from the end of Reset. If the device is
 		 * not done with its internal initialization, it must at
 		 * least return a completion TLP, with a completion status
-		 * of "Configuration Request Retry Status (CRS)". The root
+		 * of "Configuration Request Retry Status (RRS)". The root
 		 * complex must complete the request to the host by returning
 		 * a read-data value of 0001h for the Vendor ID field and
 		 * all 1s for any additional bytes included in the request.
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c
index 4bf7b433417a..886354342ef1 100644
--- a/drivers/pci/controller/dwc/pcie-tegra194.c
+++ b/drivers/pci/controller/dwc/pcie-tegra194.c
@@ -183,11 +183,11 @@
 #define GEN3_EQ_CONTROL_OFF_FB_MODE_MASK	GENMASK(3, 0)
 
 #define PORT_LOGIC_AMBA_ERROR_RESPONSE_DEFAULT	0x8D0
-#define AMBA_ERROR_RESPONSE_CRS_SHIFT		3
-#define AMBA_ERROR_RESPONSE_CRS_MASK		GENMASK(1, 0)
-#define AMBA_ERROR_RESPONSE_CRS_OKAY		0
-#define AMBA_ERROR_RESPONSE_CRS_OKAY_FFFFFFFF	1
-#define AMBA_ERROR_RESPONSE_CRS_OKAY_FFFF0001	2
+#define AMBA_ERROR_RESPONSE_RRS_SHIFT		3
+#define AMBA_ERROR_RESPONSE_RRS_MASK		GENMASK(1, 0)
+#define AMBA_ERROR_RESPONSE_RRS_OKAY		0
+#define AMBA_ERROR_RESPONSE_RRS_OKAY_FFFFFFFF	1
+#define AMBA_ERROR_RESPONSE_RRS_OKAY_FFFF0001	2
 
 #define MSIX_ADDR_MATCH_LOW_OFF			0x940
 #define MSIX_ADDR_MATCH_LOW_OFF_EN		BIT(0)
@@ -907,11 +907,11 @@ static int tegra_pcie_dw_host_init(struct dw_pcie_rp *pp)
 
 	dw_pcie_writel_dbi(pci, PCI_BASE_ADDRESS_0, 0);
 
-	/* Enable as 0xFFFF0001 response for CRS */
+	/* Enable as 0xFFFF0001 response for RRS */
 	val = dw_pcie_readl_dbi(pci, PORT_LOGIC_AMBA_ERROR_RESPONSE_DEFAULT);
-	val &= ~(AMBA_ERROR_RESPONSE_CRS_MASK << AMBA_ERROR_RESPONSE_CRS_SHIFT);
-	val |= (AMBA_ERROR_RESPONSE_CRS_OKAY_FFFF0001 <<
-		AMBA_ERROR_RESPONSE_CRS_SHIFT);
+	val &= ~(AMBA_ERROR_RESPONSE_RRS_MASK << AMBA_ERROR_RESPONSE_RRS_SHIFT);
+	val |= (AMBA_ERROR_RESPONSE_RRS_OKAY_FFFF0001 <<
+		AMBA_ERROR_RESPONSE_RRS_SHIFT);
 	dw_pcie_writel_dbi(pci, PORT_LOGIC_AMBA_ERROR_RESPONSE_DEFAULT, val);
 
 	/* Clear Slot Clock Configuration bit if SRNS configuration */
diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c
index e66594558ce2..afccae3bfbc6 100644
--- a/drivers/pci/controller/pci-aardvark.c
+++ b/drivers/pci/controller/pci-aardvark.c
@@ -50,7 +50,7 @@
 #define   PIO_COMPLETION_STATUS_MASK		GENMASK(9, 7)
 #define   PIO_COMPLETION_STATUS_OK		0
 #define   PIO_COMPLETION_STATUS_UR		1
-#define   PIO_COMPLETION_STATUS_CRS		2
+#define   PIO_COMPLETION_STATUS_RRS		2
 #define   PIO_COMPLETION_STATUS_CA		4
 #define   PIO_NON_POSTED_REQ			BIT(10)
 #define   PIO_ERR_STATUS			BIT(11)
@@ -262,7 +262,7 @@ enum {
 
 #define MSI_IRQ_NUM			32
 
-#define CFG_RD_CRS_VAL			0xffff0001
+#define CFG_RD_RRS_VAL			0xffff0001
 
 struct advk_pcie {
 	struct platform_device *pdev;
@@ -649,7 +649,7 @@ static void advk_pcie_setup_hw(struct advk_pcie *pcie)
 	advk_pcie_train_link(pcie);
 }
 
-static int advk_pcie_check_pio_status(struct advk_pcie *pcie, bool allow_crs, u32 *val)
+static int advk_pcie_check_pio_status(struct advk_pcie *pcie, bool allow_rrs, u32 *val)
 {
 	struct device *dev = &pcie->pdev->dev;
 	u32 reg;
@@ -669,7 +669,7 @@ static int advk_pcie_check_pio_status(struct advk_pcie *pcie, bool allow_crs, u3
 	 * 2) value Unsupported Request(1) of COMPLETION_STATUS(bit9:7) only
 	 *    means a PIO write error, and for PIO read it is successful with
 	 *    a read value of 0xFFFFFFFF.
-	 * 3) value Completion Retry Status(CRS) of COMPLETION_STATUS(bit9:7)
+	 * 3) value Config Request Retry Status(RRS) of COMPLETION_STATUS(bit9:7)
 	 *    only means a PIO write error, and for PIO read it is successful
 	 *    with a read value of 0xFFFF0001.
 	 * 4) value Completer Abort (CA) of COMPLETION_STATUS(bit9:7) means
@@ -694,10 +694,10 @@ static int advk_pcie_check_pio_status(struct advk_pcie *pcie, bool allow_crs, u3
 		strcomp_status = "UR";
 		ret = -EOPNOTSUPP;
 		break;
-	case PIO_COMPLETION_STATUS_CRS:
-		if (allow_crs && val) {
-			/* PCIe r4.0, sec 2.3.2, says:
-			 * If CRS Software Visibility is enabled:
+	case PIO_COMPLETION_STATUS_RRS:
+		if (allow_rrs && val) {
+			/* PCIe r6.0, sec 2.3.2, says:
+			 * If Configuration RRS Software Visibility is enabled:
 			 * For a Configuration Read Request that includes both
 			 * bytes of the Vendor ID field of a device Function's
 			 * Configuration Space Header, the Root Complex must
@@ -706,22 +706,22 @@ static int advk_pcie_check_pio_status(struct advk_pcie *pcie, bool allow_crs, u3
 			 * all '1's for any additional bytes included in the
 			 * request.
 			 *
-			 * So CRS in this case is not an error status.
+			 * So RRS in this case is not an error status.
 			 */
-			*val = CFG_RD_CRS_VAL;
+			*val = CFG_RD_RRS_VAL;
 			strcomp_status = NULL;
 			ret = 0;
 			break;
 		}
-		/* PCIe r4.0, sec 2.3.2, says:
-		 * If CRS Software Visibility is not enabled, the Root Complex
+		/* PCIe r6.0, sec 2.3.2, says:
+		 * If RRS Software Visibility is not enabled, the Root Complex
 		 * must re-issue the Configuration Request as a new Request.
-		 * If CRS Software Visibility is enabled: For a Configuration
+		 * If RRS Software Visibility is enabled: For a Configuration
 		 * Write Request or for any other Configuration Read Request,
 		 * the Root Complex must re-issue the Configuration Request as
 		 * a new Request.
 		 * A Root Complex implementation may choose to limit the number
-		 * of Configuration Request/CRS Completion Status loops before
+		 * of Configuration Request/RRS Completion Status loops before
 		 * determining that something is wrong with the target of the
 		 * Request and taking appropriate action, e.g., complete the
 		 * Request to the host as a failed transaction.
@@ -729,7 +729,7 @@ static int advk_pcie_check_pio_status(struct advk_pcie *pcie, bool allow_crs, u3
 		 * So return -EAGAIN and caller (pci-aardvark.c driver) will
 		 * re-issue request again up to the PIO_RETRY_CNT retries.
 		 */
-		strcomp_status = "CRS";
+		strcomp_status = "RRS";
 		ret = -EAGAIN;
 		break;
 	case PIO_COMPLETION_STATUS_CA:
@@ -920,8 +920,8 @@ advk_pci_bridge_emul_pcie_conf_write(struct pci_bridge_emul *bridge,
 
 	case PCI_EXP_RTCTL: {
 		u16 rootctl = le16_to_cpu(bridge->pcie_conf.rootctl);
-		/* Only emulation of PMEIE and CRSSVE bits is provided */
-		rootctl &= PCI_EXP_RTCTL_PMEIE | PCI_EXP_RTCTL_CRSSVE;
+		/* Only emulation of PMEIE and RRS_SVE bits is provided */
+		rootctl &= PCI_EXP_RTCTL_PMEIE | PCI_EXP_RTCTL_RRS_SVE;
 		bridge->pcie_conf.rootctl = cpu_to_le16(rootctl);
 		break;
 	}
@@ -1075,7 +1075,7 @@ static int advk_sw_pci_bridge_init(struct advk_pcie *pcie)
 	bridge->pcie_conf.slotsta = cpu_to_le16(PCI_EXP_SLTSTA_PDS);
 
 	/* Indicates supports for Completion Retry Status */
-	bridge->pcie_conf.rootcap = cpu_to_le16(PCI_EXP_RTCAP_CRSVIS);
+	bridge->pcie_conf.rootcap = cpu_to_le16(PCI_EXP_RTCAP_RRS_SV);
 
 	bridge->subsystem_vendor_id = advk_readl(pcie, PCIE_CORE_SSDEV_ID_REG) & 0xffff;
 	bridge->subsystem_id = advk_readl(pcie, PCIE_CORE_SSDEV_ID_REG) >> 16;
@@ -1141,7 +1141,7 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
 {
 	struct advk_pcie *pcie = bus->sysdata;
 	int retry_count;
-	bool allow_crs;
+	bool allow_rrs;
 	u32 reg;
 	int ret;
 
@@ -1153,16 +1153,16 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
 						 size, val);
 
 	/*
-	 * Completion Retry Status is possible to return only when reading
-	 * both bytes from PCI_VENDOR_ID at once and CRSSVE flag on Root
-	 * Port is enabled.
+	 * Configuration Request Retry Status (RRS) is possible to return
+	 * only when reading both bytes from PCI_VENDOR_ID at once and
+	 * RRS_SVE flag on Root Port is enabled.
 	 */
-	allow_crs = (where == PCI_VENDOR_ID) && (size >= 2) &&
+	allow_rrs = (where == PCI_VENDOR_ID) && (size >= 2) &&
 		    (le16_to_cpu(pcie->bridge.pcie_conf.rootctl) &
-		     PCI_EXP_RTCTL_CRSSVE);
+		     PCI_EXP_RTCTL_RRS_SVE);
 
 	if (advk_pcie_pio_is_running(pcie))
-		goto try_crs;
+		goto try_rrs;
 
 	/* Program the control register */
 	reg = advk_readl(pcie, PIO_CTRL);
@@ -1189,12 +1189,12 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
 
 		ret = advk_pcie_wait_pio(pcie);
 		if (ret < 0)
-			goto try_crs;
+			goto try_rrs;
 
 		retry_count += ret;
 
 		/* Check PIO status and get the read result */
-		ret = advk_pcie_check_pio_status(pcie, allow_crs, val);
+		ret = advk_pcie_check_pio_status(pcie, allow_rrs, val);
 	} while (ret == -EAGAIN && retry_count < PIO_RETRY_CNT);
 
 	if (ret < 0)
@@ -1207,13 +1207,13 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
 
 	return PCIBIOS_SUCCESSFUL;
 
-try_crs:
+try_rrs:
 	/*
-	 * If it is possible, return Completion Retry Status so that caller
-	 * tries to issue the request again instead of failing.
+	 * If it is possible, return Configuration Request Retry Status so
+	 * that caller tries to issue the request again instead of failing.
 	 */
-	if (allow_crs) {
-		*val = CFG_RD_CRS_VAL;
+	if (allow_rrs) {
+		*val = CFG_RD_RRS_VAL;
 		return PCIBIOS_SUCCESSFUL;
 	}
 
diff --git a/drivers/pci/controller/pci-xgene.c b/drivers/pci/controller/pci-xgene.c
index 8e457fa450a2..1e2ebbfa36d1 100644
--- a/drivers/pci/controller/pci-xgene.c
+++ b/drivers/pci/controller/pci-xgene.c
@@ -171,17 +171,17 @@ static int xgene_pcie_config_read32(struct pci_bus *bus, unsigned int devfn,
 
 	/*
 	 * The v1 controller has a bug in its Configuration Request Retry
-	 * Status (CRS) logic: when CRS Software Visibility is enabled and
+	 * Status (RRS) logic: when RRS Software Visibility is enabled and
 	 * we read the Vendor and Device ID of a non-existent device, the
 	 * controller fabricates return data of 0xFFFF0001 ("device exists
 	 * but is not ready") instead of 0xFFFFFFFF (PCI_ERROR_RESPONSE)
 	 * ("device does not exist").  This causes the PCI core to retry
 	 * the read until it times out.  Avoid this by not claiming to
-	 * support CRS SV.
+	 * support RRS SV.
 	 */
 	if (pci_is_root_bus(bus) && (port->version == XGENE_PCIE_IP_VER_1) &&
 	    ((where & ~0x3) == XGENE_V1_PCI_EXP_CAP + PCI_EXP_RTCTL))
-		*val &= ~(PCI_EXP_RTCAP_CRSVIS << 16);
+		*val &= ~(PCI_EXP_RTCAP_RRS_SV << 16);
 
 	if (size <= 2)
 		*val = (*val >> (8 * (where & 3))) & ((1 << (size * 8)) - 1);
diff --git a/drivers/pci/controller/pcie-iproc.c b/drivers/pci/controller/pcie-iproc.c
index 97f739a2c9f8..22134e95574b 100644
--- a/drivers/pci/controller/pcie-iproc.c
+++ b/drivers/pci/controller/pcie-iproc.c
@@ -54,7 +54,7 @@
 
 #define CFG_RD_SUCCESS			0
 #define CFG_RD_UR			1
-#define CFG_RD_CRS			2
+#define CFG_RD_RRS			2
 #define CFG_RD_CA			3
 #define CFG_RETRY_STATUS		0xffff0001
 #define CFG_RETRY_STATUS_TIMEOUT_US	500000 /* 500 milliseconds */
@@ -485,31 +485,31 @@ static unsigned int iproc_pcie_cfg_retry(struct iproc_pcie *pcie,
 	u32 status;
 
 	/*
-	 * As per PCIe spec r3.1, sec 2.3.2, CRS Software Visibility only
+	 * As per PCIe r6.0, sec 2.3.2, Config RRS Software Visibility only
 	 * affects config reads of the Vendor ID.  For config writes or any
 	 * other config reads, the Root may automatically reissue the
 	 * configuration request again as a new request.
 	 *
 	 * For config reads, this hardware returns CFG_RETRY_STATUS data
-	 * when it receives a CRS completion, regardless of the address of
-	 * the read or the CRS Software Visibility Enable bit.  As a
+	 * when it receives a RRS completion, regardless of the address of
+	 * the read or the RRS Software Visibility Enable bit.  As a
 	 * partial workaround for this, we retry in software any read that
 	 * returns CFG_RETRY_STATUS.
 	 *
 	 * Note that a non-Vendor ID config register may have a value of
 	 * CFG_RETRY_STATUS.  If we read that, we can't distinguish it from
-	 * a CRS completion, so we will incorrectly retry the read and
+	 * a RRS completion, so we will incorrectly retry the read and
 	 * eventually return the wrong data (0xffffffff).
 	 */
 	data = readl(cfg_data_p);
 	while (data == CFG_RETRY_STATUS && timeout--) {
 		/*
-		 * CRS state is set in CFG_RD status register
+		 * RRS state is set in CFG_RD status register
 		 * This will handle the case where CFG_RETRY_STATUS is
 		 * valid config data.
 		 */
 		status = iproc_pcie_read_reg(pcie, IPROC_PCIE_CFG_RD_STATUS);
-		if (status != CFG_RD_CRS)
+		if (status != CFG_RD_RRS)
 			return data;
 
 		udelay(1);
@@ -556,8 +556,8 @@ static void iproc_pcie_fix_cap(struct iproc_pcie *pcie, int where, u32 *val)
 		break;
 
 	case IPROC_PCI_EXP_CAP + PCI_EXP_RTCTL:
-		/* Don't advertise CRS SV support */
-		*val &= ~(PCI_EXP_RTCAP_CRSVIS << 16);
+		/* Don't advertise RRS SV support */
+		*val &= ~(PCI_EXP_RTCAP_RRS_SV << 16);
 		break;
 
 	default:
diff --git a/drivers/pci/pci-bridge-emul.c b/drivers/pci/pci-bridge-emul.c
index 9334b2dd4764..6658c1edd464 100644
--- a/drivers/pci/pci-bridge-emul.c
+++ b/drivers/pci/pci-bridge-emul.c
@@ -257,8 +257,8 @@ struct pci_bridge_reg_behavior pcie_cap_regs_behavior[PCI_CAP_PCIE_SIZEOF / 4] =
 		 */
 		.rw = (PCI_EXP_RTCTL_SECEE | PCI_EXP_RTCTL_SENFEE |
 		       PCI_EXP_RTCTL_SEFEE | PCI_EXP_RTCTL_PMEIE |
-		       PCI_EXP_RTCTL_CRSSVE),
-		.ro = PCI_EXP_RTCAP_CRSVIS << 16,
+		       PCI_EXP_RTCTL_RRS_SVE),
+		.ro = PCI_EXP_RTCAP_RRS_SV << 16,
 	},
 
 	[PCI_EXP_RTSTA / 4] = {
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index fc2ecb7fe081..55d6124acb2d 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1320,9 +1320,9 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
 			return -ENOTTY;
 		}
 
-		if (root && root->config_crs_sv) {
+		if (root && root->config_rrs_sv) {
 			pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
-			if (!pci_bus_crs_vendor_id(id))
+			if (!pci_bus_rrs_vendor_id(id))
 				break;
 		} else {
 			pci_read_config_dword(dev, PCI_COMMAND, &id);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index fa1997bc2667..061cc1b48fd8 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -139,7 +139,7 @@ bool pci_bridge_d3_possible(struct pci_dev *dev);
 void pci_bridge_d3_update(struct pci_dev *dev);
 int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type);
 
-static inline bool pci_bus_crs_vendor_id(u32 l)
+static inline bool pci_bus_rrs_vendor_id(u32 l)
 {
 	return (l & 0xffff) == PCI_VENDOR_ID_PCI_SIG;
 }
@@ -295,10 +295,10 @@ void pci_put_host_bridge_device(struct device *dev);
 
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
-				int crs_timeout);
+				int rrs_timeout);
 bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
-					int crs_timeout);
-int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *pl, int crs_timeout);
+					int rrs_timeout);
+int pci_idt_bus_quirk(struct pci_bus *bus, int devfn, u32 *pl, int rrs_timeout);
 
 int pci_setup_device(struct pci_dev *dev);
 int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index b1615da9eb6b..e79d9bea32bf 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1203,16 +1203,16 @@ struct pci_bus *pci_add_new_bus(struct pci_bus *parent, struct pci_dev *dev,
 }
 EXPORT_SYMBOL(pci_add_new_bus);
 
-static void pci_enable_crs(struct pci_dev *pdev)
+static void pci_enable_rrs_sv(struct pci_dev *pdev)
 {
 	u16 root_cap = 0;
 
-	/* Enable CRS Software Visibility if supported */
+	/* Enable Configuration RRS Software Visibility if supported */
 	pcie_capability_read_word(pdev, PCI_EXP_RTCAP, &root_cap);
-	if (root_cap & PCI_EXP_RTCAP_CRSVIS) {
+	if (root_cap & PCI_EXP_RTCAP_RRS_SV) {
 		pcie_capability_set_word(pdev, PCI_EXP_RTCTL,
-					 PCI_EXP_RTCTL_CRSSVE);
-		pdev->config_crs_sv = 1;
+					 PCI_EXP_RTCTL_RRS_SVE);
+		pdev->config_rrs_sv = 1;
 	}
 }
 
@@ -1328,7 +1328,7 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
 			      bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
 
-	pci_enable_crs(dev);
+	pci_enable_rrs_sv(dev);
 
 	if ((secondary || subordinate) && !pcibios_assign_all_busses() &&
 	    !is_cardbus && !broken) {
@@ -2345,23 +2345,23 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
 }
 EXPORT_SYMBOL(pci_alloc_dev);
 
-static bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l,
+static bool pci_bus_wait_rrs(struct pci_bus *bus, int devfn, u32 *l,
 			     int timeout)
 {
 	int delay = 1;
 
-	if (!pci_bus_crs_vendor_id(*l))
-		return true;	/* not a CRS completion */
+	if (!pci_bus_rrs_vendor_id(*l))
+		return true;	/* not a Configuration RRS completion */
 
 	if (!timeout)
-		return false;	/* CRS, but caller doesn't want to wait */
+		return false;	/* RRS, but caller doesn't want to wait */
 
 	/*
 	 * We got the reserved Vendor ID that indicates a completion with
-	 * Configuration Request Retry Status (CRS).  Retry until we get a
+	 * Configuration Request Retry Status (RRS).  Retry until we get a
 	 * valid Vendor ID or we time out.
 	 */
-	while (pci_bus_crs_vendor_id(*l)) {
+	while (pci_bus_rrs_vendor_id(*l)) {
 		if (delay > timeout) {
 			pr_warn("pci %04x:%02x:%02x.%d: not ready after %dms; giving up\n",
 				pci_domain_nr(bus), bus->number,
@@ -2400,8 +2400,8 @@ bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *l,
 	    *l == 0x0000ffff || *l == 0xffff0000)
 		return false;
 
-	if (pci_bus_crs_vendor_id(*l))
-		return pci_bus_wait_crs(bus, devfn, l, timeout);
+	if (pci_bus_rrs_vendor_id(*l))
+		return pci_bus_wait_rrs(bus, devfn, l, timeout);
 
 	return true;
 }
diff --git a/include/linux/bcma/bcma_driver_pci.h b/include/linux/bcma/bcma_driver_pci.h
index 68da8dba5162..dba41b65ae0d 100644
--- a/include/linux/bcma/bcma_driver_pci.h
+++ b/include/linux/bcma/bcma_driver_pci.h
@@ -203,7 +203,7 @@ struct pci_dev;
 #define BCMA_CORE_PCI_MDIO_RXCTRL0		0x840
 
 /* PCIE Root Capability Register bits (Host mode only) */
-#define BCMA_CORE_PCI_RC_CRS_VISIBILITY		0x0001
+#define BCMA_CORE_PCI_RC_RRS_VISIBILITY		0x0001
 
 struct bcma_drv_pci;
 struct bcma_bus;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 121d8d94d6d0..27d8ce977674 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -371,7 +371,7 @@ struct pci_dev {
 					   can be generated */
 	unsigned int	pme_poll:1;	/* Poll device's PME status bit */
 	unsigned int	pinned:1;	/* Whether this dev is pinned */
-	unsigned int	config_crs_sv:1; /* Config CRS software visibility */
+	unsigned int	config_rrs_sv:1; /* Config RRS software visibility */
 	unsigned int	imm_ready:1;	/* Supports Immediate Readiness */
 	unsigned int	d1_support:1;	/* Low power state D1 is supported */
 	unsigned int	d2_support:1;	/* Low power state D2 is supported */
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 94c00996e633..f94591f9f5e9 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -634,9 +634,11 @@
 #define  PCI_EXP_RTCTL_SENFEE	0x0002	/* System Error on Non-Fatal Error */
 #define  PCI_EXP_RTCTL_SEFEE	0x0004	/* System Error on Fatal Error */
 #define  PCI_EXP_RTCTL_PMEIE	0x0008	/* PME Interrupt Enable */
-#define  PCI_EXP_RTCTL_CRSSVE	0x0010	/* CRS Software Visibility Enable */
+#define  PCI_EXP_RTCTL_RRS_SVE	0x0010	/* Config RRS Software Visibility Enable */
+#define  PCI_EXP_RTCTL_CRSSVE PCI_EXP_RTCTL_RRS_SVE /* compatibility */
 #define PCI_EXP_RTCAP		0x1e	/* Root Capabilities */
-#define  PCI_EXP_RTCAP_CRSVIS	0x0001	/* CRS Software Visibility capability */
+#define  PCI_EXP_RTCAP_RRS_SV	0x0001	/* Config RRS Software Visibility */
+#define  PCI_EXP_RTCAP_CRSVIS PCI_EXP_RTCAP_RRS_SV /* compatibility */
 #define PCI_EXP_RTSTA		0x20	/* Root Status */
 #define  PCI_EXP_RTSTA_PME_RQ_ID 0x0000ffff /* PME Requester ID */
 #define  PCI_EXP_RTSTA_PME	0x00010000 /* PME status */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS
  2024-08-27 23:48 ` [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS Bjorn Helgaas
@ 2024-08-28  4:17   ` Lukas Wunner
  2024-08-28 20:53     ` Bjorn Helgaas
  2024-09-11  0:42   ` Duc Dang
  1 sibling, 1 reply; 13+ messages in thread
From: Lukas Wunner @ 2024-08-28  4:17 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Ilpo Järvinen, Maciej W . Rozycki,
	Mika Westerberg, Rafael J . Wysocki, Mario Limonciello, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas

On Tue, Aug 27, 2024 at 06:48:46PM -0500, Bjorn Helgaas wrote:
> @@ -1311,9 +1320,15 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
>  			return -ENOTTY;
>  		}
>  
> -		pci_read_config_dword(dev, PCI_COMMAND, &id);
> -		if (!PCI_POSSIBLE_ERROR(id))
> -			break;
> +		if (root && root->config_crs_sv) {
> +			pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
> +			if (!pci_bus_crs_vendor_id(id))
> +				break;

There was an effort from Amazon back in 2020/2021 to improve CRS support:

https://lore.kernel.org/linux-pci/20200307172044.29645-1-stanspas@amazon.com/

One suggestion you raised in the subsequent discussion was to use a
16-bit (word) instead of a 32-bit (dword) read of the Vendor ID
register to avoid issues with devices that don't implement CRS SV
correctly:

https://lore.kernel.org/linux-pci/20210913160745.GA1329939@bjorn-Precision-5520/

I realize that pci_bus_crs_vendor_id() masks the Device ID bits,
so probably no biggie.  Just want to make sure all lessons learned
during previous discussions on this topic are considered. :)

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS
  2024-08-28  4:17   ` Lukas Wunner
@ 2024-08-28 20:53     ` Bjorn Helgaas
  0 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2024-08-28 20:53 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: linux-pci, Ilpo Järvinen, Maciej W . Rozycki,
	Mika Westerberg, Rafael J . Wysocki, Mario Limonciello, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas, Stanislav Spassov,
	Rajat Jain

[+cc Stanislav, Rajat]

On Wed, Aug 28, 2024 at 06:17:04AM +0200, Lukas Wunner wrote:
> On Tue, Aug 27, 2024 at 06:48:46PM -0500, Bjorn Helgaas wrote:
> > @@ -1311,9 +1320,15 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
> >  			return -ENOTTY;
> >  		}
> >  
> > -		pci_read_config_dword(dev, PCI_COMMAND, &id);
> > -		if (!PCI_POSSIBLE_ERROR(id))
> > -			break;
> > +		if (root && root->config_crs_sv) {
> > +			pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
> > +			if (!pci_bus_crs_vendor_id(id))
> > +				break;
> 
> There was an effort from Amazon back in 2020/2021 to improve CRS support:
> 
> https://lore.kernel.org/linux-pci/20200307172044.29645-1-stanspas@amazon.com/

Thanks for reminding me of that, and I'm sorry that series didn't get
applied back then because it's very similar to this one.

> One suggestion you raised in the subsequent discussion was to use a
> 16-bit (word) instead of a 32-bit (dword) read of the Vendor ID
> register to avoid issues with devices that don't implement CRS SV
> correctly:
> 
> https://lore.kernel.org/linux-pci/20210913160745.GA1329939@bjorn-Precision-5520/

Thanks.  Reading that response, I don't understand my point about using
a 16-bit read.  I mentioned 89665a6a7140 ("PCI: Check only the Vendor
ID to identify Configuration Request Retry"), the commit log of which
points to http://lkml.kernel.org/r/4729FC36.3040000@gmail.com, which
documents several defective devices that have a Vendor ID of 0x0001.

E.g., the Realtek rtl8169 controller mentioned there has Vendor/Device
ID of [0001:8168].  Doing either a 16-bit read or a 32-bit read and
checking the low 16 bits would incorrectly treat that as a Config RRS
completion.

So it *looks* to me like we will time out after 60 seconds and
conclude the device never became ready:

  pci_scan_device(..., timeout=60*1000)
    pci_bus_read_dev_vendor_id
      pci_bus_generic_read_dev_vendor_id
        pci_bus_read_config_dword(PCI_VENDOR_ID, l)  # <--
        # *l == 0x81680001
        if (pci_bus_crs_vendor_id(*l))        # 0x81680001 & 0xffff = 0x0001
          pci_bus_wait_crs(..., timeout=60*1000)
            while (pci_bus_crs_vendor_id(*l)) {
              if (delay > timeout)
                return false;                 # device not ready
              pci_bus_read_config_dword(PCI_VENDOR_ID, l)
            }

That *might* be an argument for doing a 32-bit read and checking for
0xffff0001, since the spec requires all 1's in the extra bytes.  But
I'm not aware of any complaints about these broken devices with a
0x0001 Vendor ID, and the 89665a6a7140 commit log says there are also
defective devices that don't fill the extra bytes with all 1's.

So my inclination is to keep the current code that does a 32-bit read
and checks only the low 16 bits.

> I realize that pci_bus_crs_vendor_id() masks the Device ID bits,
> so probably no biggie.  Just want to make sure all lessons learned
> during previous discussions on this topic are considered. :)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready
  2024-08-27 23:48 [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Bjorn Helgaas
                   ` (2 preceding siblings ...)
  2024-08-27 23:48 ` [RFC PATCH 3/3] PCI: Rename CRS Completion Status to RRS Bjorn Helgaas
@ 2024-08-28 21:24 ` Mario Limonciello
  2024-08-28 21:42   ` Bjorn Helgaas
  2024-09-10  0:59 ` Bjorn Helgaas
  2024-09-10 22:55 ` Bjorn Helgaas
  5 siblings, 1 reply; 13+ messages in thread
From: Mario Limonciello @ 2024-08-28 21:24 UTC (permalink / raw)
  To: Bjorn Helgaas, linux-pci
  Cc: Ilpo Järvinen, Maciej W . Rozycki, Mika Westerberg,
	Lukas Wunner, Rafael J . Wysocki, Duc Dang, Alex Williamson,
	linux-kernel, Bjorn Helgaas, Li, Gary

On 8/27/2024 18:48, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
> 
> After a device reset, pci_dev_wait() waits for a device to become
> completely ready by polling the PCI_COMMAND register.  The spec envisions
> that software would instead poll for the device to stop responding to
> config reads with Completions with Request Retry Status (RRS).
> 
> Polling PCI_COMMAND leads to hardware retries that are invisible to
> software and the backoff between software retries doesn't work correctly.
> 
> Root Ports are not required to support the Configuration RRS Software
> Visibility feature that prevents hardware retries and makes the RRS
> Completions visible to software, so this series only uses it when available
> and falls back to PCI_COMMAND polling when it's not.
> 
> This is completely untested and posted for comments.
> 
> Bjorn Helgaas (3):
>    PCI: Wait for device readiness with Configuration RRS
>    PCI: aardvark: Correct Configuration RRS checking
>    PCI: Rename CRS Completion Status to RRS
> 
>   drivers/bcma/driver_pci_host.c             | 10 ++--
>   drivers/pci/controller/dwc/pcie-tegra194.c | 18 +++---
>   drivers/pci/controller/pci-aardvark.c      | 64 +++++++++++-----------
>   drivers/pci/controller/pci-xgene.c         |  6 +-
>   drivers/pci/controller/pcie-iproc.c        | 18 +++---
>   drivers/pci/pci-bridge-emul.c              |  4 +-
>   drivers/pci/pci.c                          | 41 +++++++++-----
>   drivers/pci/pci.h                          | 11 +++-
>   drivers/pci/probe.c                        | 33 +++++------
>   include/linux/bcma/bcma_driver_pci.h       |  2 +-
>   include/linux/pci.h                        |  1 +
>   include/uapi/linux/pci_regs.h              |  6 +-
>   12 files changed, 117 insertions(+), 97 deletions(-)
> 

Although this looks like a useful series, I'm sorry to say but this 
doesn't solve the issue that Gary and I raised.  We double checked today 
and found that reading the vendor ID works just fine at this time.

I think that we're still better off polling PCI_PM_CTRL to "wait" for D0 
after the state change from D3cold.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready
  2024-08-28 21:24 ` [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Mario Limonciello
@ 2024-08-28 21:42   ` Bjorn Helgaas
  2024-08-28 22:26     ` Mario Limonciello
  0 siblings, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2024-08-28 21:42 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: linux-pci, Ilpo Järvinen, Maciej W . Rozycki,
	Mika Westerberg, Lukas Wunner, Rafael J . Wysocki, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas, Li, Gary

On Wed, Aug 28, 2024 at 04:24:01PM -0500, Mario Limonciello wrote:
> On 8/27/2024 18:48, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <bhelgaas@google.com>
> > 
> > After a device reset, pci_dev_wait() waits for a device to become
> > completely ready by polling the PCI_COMMAND register.  The spec envisions
> > that software would instead poll for the device to stop responding to
> > config reads with Completions with Request Retry Status (RRS).
> > 
> > Polling PCI_COMMAND leads to hardware retries that are invisible to
> > software and the backoff between software retries doesn't work correctly.
> > 
> > Root Ports are not required to support the Configuration RRS Software
> > Visibility feature that prevents hardware retries and makes the RRS
> > Completions visible to software, so this series only uses it when available
> > and falls back to PCI_COMMAND polling when it's not.
> > 
> > This is completely untested and posted for comments.
> > 
> > Bjorn Helgaas (3):
> >    PCI: Wait for device readiness with Configuration RRS
> >    PCI: aardvark: Correct Configuration RRS checking
> >    PCI: Rename CRS Completion Status to RRS
> > 
> >   drivers/bcma/driver_pci_host.c             | 10 ++--
> >   drivers/pci/controller/dwc/pcie-tegra194.c | 18 +++---
> >   drivers/pci/controller/pci-aardvark.c      | 64 +++++++++++-----------
> >   drivers/pci/controller/pci-xgene.c         |  6 +-
> >   drivers/pci/controller/pcie-iproc.c        | 18 +++---
> >   drivers/pci/pci-bridge-emul.c              |  4 +-
> >   drivers/pci/pci.c                          | 41 +++++++++-----
> >   drivers/pci/pci.h                          | 11 +++-
> >   drivers/pci/probe.c                        | 33 +++++------
> >   include/linux/bcma/bcma_driver_pci.h       |  2 +-
> >   include/linux/pci.h                        |  1 +
> >   include/uapi/linux/pci_regs.h              |  6 +-
> >   12 files changed, 117 insertions(+), 97 deletions(-)
> 
> Although this looks like a useful series, I'm sorry to say but this doesn't
> solve the issue that Gary and I raised.  We double checked today and found
> that reading the vendor ID works just fine at this time.

Thanks for testing that.

> I think that we're still better off polling PCI_PM_CTRL to "wait" for D0
> after the state change from D3cold.

Is there some spec justification for polling PCI_PM_CTRL?  I'm dubious
about doing that just because "it works" in this situation, unless we
have some better understanding about *why* it works and whether all
devices are supposed to work that way.

Bjorn

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready
  2024-08-28 21:42   ` Bjorn Helgaas
@ 2024-08-28 22:26     ` Mario Limonciello
  2024-08-29 23:12       ` Bjorn Helgaas
  0 siblings, 1 reply; 13+ messages in thread
From: Mario Limonciello @ 2024-08-28 22:26 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Ilpo Järvinen, Maciej W . Rozycki,
	Mika Westerberg, Lukas Wunner, Rafael J . Wysocki, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas, Li, Gary

On 8/28/2024 16:42, Bjorn Helgaas wrote:
> On Wed, Aug 28, 2024 at 04:24:01PM -0500, Mario Limonciello wrote:
>> On 8/27/2024 18:48, Bjorn Helgaas wrote:
>>> From: Bjorn Helgaas <bhelgaas@google.com>
>>>
>>> After a device reset, pci_dev_wait() waits for a device to become
>>> completely ready by polling the PCI_COMMAND register.  The spec envisions
>>> that software would instead poll for the device to stop responding to
>>> config reads with Completions with Request Retry Status (RRS).
>>>
>>> Polling PCI_COMMAND leads to hardware retries that are invisible to
>>> software and the backoff between software retries doesn't work correctly.
>>>
>>> Root Ports are not required to support the Configuration RRS Software
>>> Visibility feature that prevents hardware retries and makes the RRS
>>> Completions visible to software, so this series only uses it when available
>>> and falls back to PCI_COMMAND polling when it's not.
>>>
>>> This is completely untested and posted for comments.
>>>
>>> Bjorn Helgaas (3):
>>>     PCI: Wait for device readiness with Configuration RRS
>>>     PCI: aardvark: Correct Configuration RRS checking
>>>     PCI: Rename CRS Completion Status to RRS
>>>
>>>    drivers/bcma/driver_pci_host.c             | 10 ++--
>>>    drivers/pci/controller/dwc/pcie-tegra194.c | 18 +++---
>>>    drivers/pci/controller/pci-aardvark.c      | 64 +++++++++++-----------
>>>    drivers/pci/controller/pci-xgene.c         |  6 +-
>>>    drivers/pci/controller/pcie-iproc.c        | 18 +++---
>>>    drivers/pci/pci-bridge-emul.c              |  4 +-
>>>    drivers/pci/pci.c                          | 41 +++++++++-----
>>>    drivers/pci/pci.h                          | 11 +++-
>>>    drivers/pci/probe.c                        | 33 +++++------
>>>    include/linux/bcma/bcma_driver_pci.h       |  2 +-
>>>    include/linux/pci.h                        |  1 +
>>>    include/uapi/linux/pci_regs.h              |  6 +-
>>>    12 files changed, 117 insertions(+), 97 deletions(-)
>>
>> Although this looks like a useful series, I'm sorry to say but this doesn't
>> solve the issue that Gary and I raised.  We double checked today and found
>> that reading the vendor ID works just fine at this time.
> 
> Thanks for testing that.

Sure.

> 
>> I think that we're still better off polling PCI_PM_CTRL to "wait" for D0
>> after the state change from D3cold.
> 
> Is there some spec justification for polling PCI_PM_CTRL?  I'm dubious
> about doing that just because "it works" in this situation, unless we
> have some better understanding about *why* it works and whether all
> devices are supposed to work that way.
> 

I mentioned this a little bit in my patch 3/5 in my submission.  The 
issue isn't "normal" D3cold exit that is fully settled down.
That takes ~6ms from measurements.  The issue is how long it takes for 
D3cold *entry* followed by *exit*.

When this issue occurs it's tied with a tight loop of runtime PM entry 
and exit happening in that short window.
That's why it can be tripped by unplugging a dock, waiting until 
~approximately autosuspend delay and plugging it back in.  If you catch 
the right timing then the USB4 router is still on it's way down to D3cold.

In terms of a way to match this problem to the spec, the closest I could 
think is PCI-PM spec.

But I do see in the PCI-PM spec that the delay for D0->D3hot should be 
10ms. In the Linux kernel implementation __pci_set_power_state() when 
called with D3cold calls pci_set_low_power_state() which does wait 10ms 
followed by using the platform to remove power.

I can't find any timing requirements for D3hot->D3cold transition though.

I would hypothesize that if we injected a longer delay on the "other 
end" for the D3cold transition entry it would solve this issue as well 
though.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready
  2024-08-28 22:26     ` Mario Limonciello
@ 2024-08-29 23:12       ` Bjorn Helgaas
  0 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2024-08-29 23:12 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: linux-pci, Ilpo Järvinen, Maciej W . Rozycki,
	Mika Westerberg, Lukas Wunner, Rafael J . Wysocki, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas, Li, Gary

On Wed, Aug 28, 2024 at 05:26:32PM -0500, Mario Limonciello wrote:
> On 8/28/2024 16:42, Bjorn Helgaas wrote:
> > On Wed, Aug 28, 2024 at 04:24:01PM -0500, Mario Limonciello wrote:
> > > On 8/27/2024 18:48, Bjorn Helgaas wrote:
> > > > From: Bjorn Helgaas <bhelgaas@google.com>
> > > > 
> > > > After a device reset, pci_dev_wait() waits for a device to become
> > > > completely ready by polling the PCI_COMMAND register.  The spec envisions
> > > > that software would instead poll for the device to stop responding to
> > > > config reads with Completions with Request Retry Status (RRS).
> > > > 
> > > > Polling PCI_COMMAND leads to hardware retries that are invisible to
> > > > software and the backoff between software retries doesn't work correctly.
> > > > 
> > > > Root Ports are not required to support the Configuration RRS Software
> > > > Visibility feature that prevents hardware retries and makes the RRS
> > > > Completions visible to software, so this series only uses it when available
> > > > and falls back to PCI_COMMAND polling when it's not.
> > > > 
> > > > This is completely untested and posted for comments.
> > > > 
> > > > Bjorn Helgaas (3):
> > > >     PCI: Wait for device readiness with Configuration RRS
> > > >     PCI: aardvark: Correct Configuration RRS checking
> > > >     PCI: Rename CRS Completion Status to RRS
> > > > 
> > > >    drivers/bcma/driver_pci_host.c             | 10 ++--
> > > >    drivers/pci/controller/dwc/pcie-tegra194.c | 18 +++---
> > > >    drivers/pci/controller/pci-aardvark.c      | 64 +++++++++++-----------
> > > >    drivers/pci/controller/pci-xgene.c         |  6 +-
> > > >    drivers/pci/controller/pcie-iproc.c        | 18 +++---
> > > >    drivers/pci/pci-bridge-emul.c              |  4 +-
> > > >    drivers/pci/pci.c                          | 41 +++++++++-----
> > > >    drivers/pci/pci.h                          | 11 +++-
> > > >    drivers/pci/probe.c                        | 33 +++++------
> > > >    include/linux/bcma/bcma_driver_pci.h       |  2 +-
> > > >    include/linux/pci.h                        |  1 +
> > > >    include/uapi/linux/pci_regs.h              |  6 +-
> > > >    12 files changed, 117 insertions(+), 97 deletions(-)
> > > 
> > > Although this looks like a useful series, I'm sorry to say but
> > > this doesn't solve the issue that Gary and I raised.  We double
> > > checked today and found that reading the vendor ID works just
> > > fine at this time.
> > 
> > Thanks for testing that.
> 
> Sure.
> 
> > > I think that we're still better off polling PCI_PM_CTRL to
> > > "wait" for D0 after the state change from D3cold.
> > 
> > Is there some spec justification for polling PCI_PM_CTRL?  I'm
> > dubious about doing that just because "it works" in this
> > situation, unless we have some better understanding about *why* it
> > works and whether all devices are supposed to work that way.
> 
> I mentioned this a little bit in my patch 3/5 in my submission.  The
> issue isn't "normal" D3cold exit that is fully settled down.  That
> takes ~6ms from measurements.  The issue is how long it takes for
> D3cold *entry* followed by *exit*.

I think we should have this conversation in the context of your series
(https://lore.kernel.org/r/20240823154023.360234-1-superm1@kernel.org)
because PCI_PM_CTRL questions are more relevant there, so I'll respond
there.

> When this issue occurs it's tied with a tight loop of runtime PM
> entry and exit happening in that short window.  That's why it can be
> tripped by unplugging a dock, waiting until ~approximately
> autosuspend delay and plugging it back in.  If you catch the right
> timing then the USB4 router is still on its way down to D3cold.
> 
> In terms of a way to match this problem to the spec, the closest I
> could think is PCI-PM spec.
> 
> But I do see in the PCI-PM spec that the delay for D0->D3hot should
> be 10ms.  In the Linux kernel implementation __pci_set_power_state()
> when called with D3cold calls pci_set_low_power_state() which does
> wait 10ms followed by using the platform to remove power.
> 
> I can't find any timing requirements for D3hot->D3cold transition
> though.
> 
> I would hypothesize that if we injected a longer delay on the "other
> end" for the D3cold transition entry it would solve this issue as
> well though.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready
  2024-08-27 23:48 [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Bjorn Helgaas
                   ` (3 preceding siblings ...)
  2024-08-28 21:24 ` [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Mario Limonciello
@ 2024-09-10  0:59 ` Bjorn Helgaas
  2024-09-10 22:55 ` Bjorn Helgaas
  5 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2024-09-10  0:59 UTC (permalink / raw)
  To: linux-pci
  Cc: Ilpo Järvinen, Maciej W . Rozycki, Mika Westerberg,
	Lukas Wunner, Rafael J . Wysocki, Mario Limonciello, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas, Stanislav Spassov

On Tue, Aug 27, 2024 at 06:48:45PM -0500, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
> 
> After a device reset, pci_dev_wait() waits for a device to become
> completely ready by polling the PCI_COMMAND register.  The spec envisions
> that software would instead poll for the device to stop responding to
> config reads with Completions with Request Retry Status (RRS).
> 
> Polling PCI_COMMAND leads to hardware retries that are invisible to
> software and the backoff between software retries doesn't work correctly.
> 
> Root Ports are not required to support the Configuration RRS Software
> Visibility feature that prevents hardware retries and makes the RRS
> Completions visible to software, so this series only uses it when available
> and falls back to PCI_COMMAND polling when it's not.
> 
> This is completely untested and posted for comments.
> 
> Bjorn Helgaas (3):
>   PCI: Wait for device readiness with Configuration RRS
>   PCI: aardvark: Correct Configuration RRS checking
>   PCI: Rename CRS Completion Status to RRS
> 
>  drivers/bcma/driver_pci_host.c             | 10 ++--
>  drivers/pci/controller/dwc/pcie-tegra194.c | 18 +++---
>  drivers/pci/controller/pci-aardvark.c      | 64 +++++++++++-----------
>  drivers/pci/controller/pci-xgene.c         |  6 +-
>  drivers/pci/controller/pcie-iproc.c        | 18 +++---
>  drivers/pci/pci-bridge-emul.c              |  4 +-
>  drivers/pci/pci.c                          | 41 +++++++++-----
>  drivers/pci/pci.h                          | 11 +++-
>  drivers/pci/probe.c                        | 33 +++++------
>  include/linux/bcma/bcma_driver_pci.h       |  2 +-
>  include/linux/pci.h                        |  1 +
>  include/uapi/linux/pci_regs.h              |  6 +-
>  12 files changed, 117 insertions(+), 97 deletions(-)

I'd like to merge this unchanged (except for adding credit to
Stanislav in the [1/3] commit log) for v6.12.  Let me know if you
test, review, or object.

Bjorn

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready
  2024-08-27 23:48 [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Bjorn Helgaas
                   ` (4 preceding siblings ...)
  2024-09-10  0:59 ` Bjorn Helgaas
@ 2024-09-10 22:55 ` Bjorn Helgaas
  5 siblings, 0 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2024-09-10 22:55 UTC (permalink / raw)
  To: linux-pci
  Cc: Ilpo Järvinen, Maciej W . Rozycki, Mika Westerberg,
	Lukas Wunner, Rafael J . Wysocki, Mario Limonciello, Duc Dang,
	Alex Williamson, linux-kernel, Bjorn Helgaas

On Tue, Aug 27, 2024 at 06:48:45PM -0500, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
> 
> After a device reset, pci_dev_wait() waits for a device to become
> completely ready by polling the PCI_COMMAND register.  The spec envisions
> that software would instead poll for the device to stop responding to
> config reads with Completions with Request Retry Status (RRS).
> 
> Polling PCI_COMMAND leads to hardware retries that are invisible to
> software and the backoff between software retries doesn't work correctly.
> 
> Root Ports are not required to support the Configuration RRS Software
> Visibility feature that prevents hardware retries and makes the RRS
> Completions visible to software, so this series only uses it when available
> and falls back to PCI_COMMAND polling when it's not.
> 
> This is completely untested and posted for comments.
> 
> Bjorn Helgaas (3):
>   PCI: Wait for device readiness with Configuration RRS
>   PCI: aardvark: Correct Configuration RRS checking
>   PCI: Rename CRS Completion Status to RRS
> 
>  drivers/bcma/driver_pci_host.c             | 10 ++--
>  drivers/pci/controller/dwc/pcie-tegra194.c | 18 +++---
>  drivers/pci/controller/pci-aardvark.c      | 64 +++++++++++-----------
>  drivers/pci/controller/pci-xgene.c         |  6 +-
>  drivers/pci/controller/pcie-iproc.c        | 18 +++---
>  drivers/pci/pci-bridge-emul.c              |  4 +-
>  drivers/pci/pci.c                          | 41 +++++++++-----
>  drivers/pci/pci.h                          | 11 +++-
>  drivers/pci/probe.c                        | 33 +++++------
>  include/linux/bcma/bcma_driver_pci.h       |  2 +-
>  include/linux/pci.h                        |  1 +
>  include/uapi/linux/pci_regs.h              |  6 +-
>  12 files changed, 117 insertions(+), 97 deletions(-)

I provisionally applied this on pci/crs for v6.12.  Please let me know
of any concerns.  I do have one internal test report, but more would
be better since I know this affects many devices.

Bjorn

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS
  2024-08-27 23:48 ` [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS Bjorn Helgaas
  2024-08-28  4:17   ` Lukas Wunner
@ 2024-09-11  0:42   ` Duc Dang
  1 sibling, 0 replies; 13+ messages in thread
From: Duc Dang @ 2024-09-11  0:42 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: linux-pci, Ilpo Järvinen, Maciej W . Rozycki,
	Mika Westerberg, Lukas Wunner, Rafael J . Wysocki,
	Mario Limonciello, Alex Williamson, linux-kernel, Bjorn Helgaas

Tested-by: Duc Dang <ducdang@google.com>

On Tue, Aug 27, 2024 at 4:57 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> From: Bjorn Helgaas <bhelgaas@google.com>
>
> After a device reset, delays are required before the device can
> successfully complete config accesses.  PCIe r6.0, sec 6.6, specifies some
> delays required before software can perform config accesses.  Devices that
> require more time after those delays may respond to config accesses with
> Configuration Request Retry Status (RRS) completions.
>
> Callers of pci_dev_wait() are responsible for delays until the device can
> respond to config accesses.  pci_dev_wait() waits any additional time until
> the device can successfully complete config accesses.
>
> Reading config space of devices that are not present or not ready typically
> returns ~0 (PCI_ERROR_RESPONSE).  Previously we polled the Command register
> until we got a value other than ~0.  This is sometimes a problem because
> Root Complex handling of RRS completions may include several retries and
> implementation-specific behavior that is invisible to software (see sec
> 2.3.2), so the exponential backoff in pci_dev_wait() may not work as
> intended.
>
> Linux enables Configuration RRS Software Visibility on all Root Ports that
> support it.  If it is enabled, read the Vendor ID instead of the Command
> register.  RRS completions cause immediate return of the 0x0001 reserved
> Vendor ID value, so the pci_dev_wait() backoff works correctly.
>
> When a read of Vendor ID eventually completes successfully by returning a
> non-0x0001 value (the Vendor ID or 0xffff for VFs), the device should be
> initialized and ready to respond to config requests.
>
> For conventional PCI devices or devices below Root Ports that don't support
> Configuration RRS Software Visibility, poll the Command register as before.
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/pci/pci.c   | 41 ++++++++++++++++++++++++++++-------------
>  drivers/pci/pci.h   |  5 +++++
>  drivers/pci/probe.c |  9 +++------
>  include/linux/pci.h |  1 +
>  4 files changed, 37 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e3a49f66982d..fc2ecb7fe081 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1283,7 +1283,9 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
>  {
>         int delay = 1;
>         bool retrain = false;
> -       struct pci_dev *bridge;
> +       struct pci_dev *root, *bridge;
> +
> +       root = pcie_find_root_port(dev);
>
>         if (pci_is_pcie(dev)) {
>                 bridge = pci_upstream_bridge(dev);
> @@ -1292,16 +1294,23 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
>         }
>
>         /*
> -        * After reset, the device should not silently discard config
> -        * requests, but it may still indicate that it needs more time by
> -        * responding to them with CRS completions.  The Root Port will
> -        * generally synthesize ~0 (PCI_ERROR_RESPONSE) data to complete
> -        * the read (except when CRS SV is enabled and the read was for the
> -        * Vendor ID; in that case it synthesizes 0x0001 data).
> +        * The caller has already waited long enough after a reset that the
> +        * device should respond to config requests, but it may respond
> +        * with Request Retry Status (RRS) if it needs more time to
> +        * initialize.
>          *
> -        * Wait for the device to return a non-CRS completion.  Read the
> -        * Command register instead of Vendor ID so we don't have to
> -        * contend with the CRS SV value.
> +        * If the device is below a Root Port with Configuration RRS
> +        * Software Visibility enabled, reading the Vendor ID returns a
> +        * special data value if the device responded with RRS.  Read the
> +        * Vendor ID until we get non-RRS status.
> +        *
> +        * If there's no Root Port or Configuration RRS Software Visibility
> +        * is not enabled, the device may still respond with RRS, but
> +        * hardware may retry the config request.  If no retries receive
> +        * Successful Completion, hardware generally synthesizes ~0
> +        * (PCI_ERROR_RESPONSE) data to complete the read.  Reading Vendor
> +        * ID for VFs and non-existent devices also returns ~0, so read the
> +        * Command register until it returns something other than ~0.
>          */
>         for (;;) {
>                 u32 id;
> @@ -1311,9 +1320,15 @@ static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
>                         return -ENOTTY;
>                 }
>
> -               pci_read_config_dword(dev, PCI_COMMAND, &id);
> -               if (!PCI_POSSIBLE_ERROR(id))
> -                       break;
> +               if (root && root->config_crs_sv) {
> +                       pci_read_config_dword(dev, PCI_VENDOR_ID, &id);
> +                       if (!pci_bus_crs_vendor_id(id))
> +                               break;
> +               } else {
> +                       pci_read_config_dword(dev, PCI_COMMAND, &id);
> +                       if (!PCI_POSSIBLE_ERROR(id))
> +                               break;
> +               }
>
>                 if (delay > timeout) {
>                         pci_warn(dev, "not ready %dms after %s; giving up\n",
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 79c8398f3938..fa1997bc2667 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -139,6 +139,11 @@ bool pci_bridge_d3_possible(struct pci_dev *dev);
>  void pci_bridge_d3_update(struct pci_dev *dev);
>  int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type);
>
> +static inline bool pci_bus_crs_vendor_id(u32 l)
> +{
> +       return (l & 0xffff) == PCI_VENDOR_ID_PCI_SIG;
> +}
> +
>  static inline void pci_wakeup_event(struct pci_dev *dev)
>  {
>         /* Wait 100 ms before the system can be put into a sleep state. */
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index b14b9876c030..b1615da9eb6b 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1209,9 +1209,11 @@ static void pci_enable_crs(struct pci_dev *pdev)
>
>         /* Enable CRS Software Visibility if supported */
>         pcie_capability_read_word(pdev, PCI_EXP_RTCAP, &root_cap);
> -       if (root_cap & PCI_EXP_RTCAP_CRSVIS)
> +       if (root_cap & PCI_EXP_RTCAP_CRSVIS) {
>                 pcie_capability_set_word(pdev, PCI_EXP_RTCTL,
>                                          PCI_EXP_RTCTL_CRSSVE);
> +               pdev->config_crs_sv = 1;
> +       }
>  }
>
>  static unsigned int pci_scan_child_bus_extend(struct pci_bus *bus,
> @@ -2343,11 +2345,6 @@ struct pci_dev *pci_alloc_dev(struct pci_bus *bus)
>  }
>  EXPORT_SYMBOL(pci_alloc_dev);
>
> -static bool pci_bus_crs_vendor_id(u32 l)
> -{
> -       return (l & 0xffff) == PCI_VENDOR_ID_PCI_SIG;
> -}
> -
>  static bool pci_bus_wait_crs(struct pci_bus *bus, int devfn, u32 *l,
>                              int timeout)
>  {
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 4cf89a4b4cbc..121d8d94d6d0 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -371,6 +371,7 @@ struct pci_dev {
>                                            can be generated */
>         unsigned int    pme_poll:1;     /* Poll device's PME status bit */
>         unsigned int    pinned:1;       /* Whether this dev is pinned */
> +       unsigned int    config_crs_sv:1; /* Config CRS software visibility */
>         unsigned int    imm_ready:1;    /* Supports Immediate Readiness */
>         unsigned int    d1_support:1;   /* Low power state D1 is supported */
>         unsigned int    d2_support:1;   /* Low power state D2 is supported */
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-09-11  0:42 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-27 23:48 [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Bjorn Helgaas
2024-08-27 23:48 ` [RFC PATCH 1/3] PCI: Wait for device readiness with Configuration RRS Bjorn Helgaas
2024-08-28  4:17   ` Lukas Wunner
2024-08-28 20:53     ` Bjorn Helgaas
2024-09-11  0:42   ` Duc Dang
2024-08-27 23:48 ` [RFC PATCH 2/3] PCI: aardvark: Correct Configuration RRS checking Bjorn Helgaas
2024-08-27 23:48 ` [RFC PATCH 3/3] PCI: Rename CRS Completion Status to RRS Bjorn Helgaas
2024-08-28 21:24 ` [RFC PATCH 0/3] PCI: Use Configuration RRS to wait for device ready Mario Limonciello
2024-08-28 21:42   ` Bjorn Helgaas
2024-08-28 22:26     ` Mario Limonciello
2024-08-29 23:12       ` Bjorn Helgaas
2024-09-10  0:59 ` Bjorn Helgaas
2024-09-10 22:55 ` Bjorn Helgaas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox