public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* (no subject)
@ 2026-04-28 18:24 Fabio M. De Francesco
  2026-04-28 18:24 ` [PATCH 1/2] PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled Fabio M. De Francesco
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Fabio M. De Francesco @ 2026-04-28 18:24 UTC (permalink / raw)
  To: linux-cxl
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Bjorn Helgaas,
	linux-kernel, linux-pci, Fabio M. De Francesco

Subject: [PATCH 0/2] PCI/CXL: Recover CXL Downstream Ports from PM Init failure

CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which a
Secondary Bus Reset, a Link Down, or Downstream Port Containment on a
CXL Downstream Port prevents Port PM Init from completing when ACS
Source Validation is enabled on the Downstream Port. The spec states
that another SBR alone does not recover the port and describes a
software recovery sequence.  

Patch 1 extends cxl_reset_bus_function(), the helper backing the cxl_bus
PCI/CXL reset method exposed to userspace via sysfs. It saves, clears,
and restores ACS Source Validation and Bus Master Enable on the CXL
Downstream Port around the SBR it issues. This keeps the userspace
cxl_bus reset path from leaving the port unable to complete PM Init.

Patch 2 adds a recovery pass during CXL enumeration. For each CXL
Downstream Port in a memdev's ancestry, the CXL core checks whether PM
Init has completed. If it has not, regardless of what caused the
failure, it invokes cxl_reset_bus_function() on the child below the port
in the hope of restoring the port to a usable state. CXL enumeration
re-runs after events that tear down and re-probe the memdev, including
DPC, AER, and Link Down, so those paths reach this recovery.

This small series is developed from an old RFC v3:
https://lore.kernel.org/linux-cxl/20260330193347.25072-1-fabio.m.de.francesco@linux.intel.com/

Fabio M. De Francesco (2):
  PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled
  cxl/core: Recover from PM Init failure via cxl_reset_bus_function()

drivers/cxl/core/pci.c        | 30 ++++++++++++++++++++
 drivers/cxl/core/port.c       | 22 +++++++++++++++
 drivers/cxl/cxlpci.h          |  3 ++
 drivers/pci/pci.c             | 52 ++++++++++++++++++++++++++++++++++-
 include/linux/pci.h           |  1 + 
 include/uapi/linux/pci_regs.h |  2 ++
 6 files changed, 109 insertions(+), 1 deletion(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled
  2026-04-28 18:24 Fabio M. De Francesco
@ 2026-04-28 18:24 ` Fabio M. De Francesco
  2026-05-01 18:36   ` Dave Jiang
  2026-04-28 18:24 ` [PATCH 2/2] cxl/core: Recover from PM Init failure via cxl_reset_bus_function() Fabio M. De Francesco
  2026-05-01 22:01 ` Dave Jiang
  2 siblings, 1 reply; 6+ messages in thread
From: Fabio M. De Francesco @ 2026-04-28 18:24 UTC (permalink / raw)
  To: linux-cxl
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Bjorn Helgaas,
	linux-kernel, linux-pci, Fabio M. De Francesco

CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which
issuing a Secondary Bus Reset on a CXL Downstream Port leaves the
Port Power Management Initialization Complete bit unset when the PCIe
Access Control Service (ACS) Source Validation bit (SV) is enabled on
the Downstream Port. The spec states that another SBR alone will not
facilitate recovery and shows a software recovery sequence.

Implement the sequence by extending cxl_reset_bus_function() to save,
clear, and restore ACS SV and Bus Master Enable (BME) on the Downstream
Port around the SBR with the use of helpers.

The wait inside pci_bridge_secondary_bus_reset() covers the 100 ms
referenced by the spec. The helpers return when ACS SV is not enabled on
the Downstream Port.

Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
 drivers/pci/pci.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8f7cfcc00090..047d3b4508a5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4930,10 +4930,55 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
 	return rc;
 }
 
+static void cxl_disable_acs_sv_bme(struct pci_dev *bridge, u16 *saved_cmd,
+				   u16 *saved_acs_ctrl)
+{
+	if (!bridge->acs_cap)
+		return;
+
+	pci_read_config_word(bridge, bridge->acs_cap + PCI_ACS_CTRL,
+			     saved_acs_ctrl);
+	if (!(*saved_acs_ctrl & PCI_ACS_SV))
+		return;
+
+	pci_read_config_word(bridge, PCI_COMMAND, saved_cmd);
+	if (*saved_cmd & PCI_COMMAND_MASTER)
+		pci_clear_master(bridge);
+
+	pci_write_config_word(bridge, bridge->acs_cap + PCI_ACS_CTRL,
+			      *saved_acs_ctrl & ~PCI_ACS_SV);
+}
+
+static void cxl_restore_acs_sv_bme(struct pci_dev *bridge, u16 saved_cmd,
+				   u16 saved_acs_ctrl)
+{
+	if (!bridge->acs_cap || !(saved_acs_ctrl & PCI_ACS_SV))
+		return;
+
+	pci_write_config_word(bridge, bridge->acs_cap + PCI_ACS_CTRL,
+			      saved_acs_ctrl);
+	if (saved_cmd & PCI_COMMAND_MASTER)
+		pci_set_master(bridge);
+}
+
+/**
+ * cxl_reset_bus_function - SBR for a child of a CXL downstream port
+ * @dev: child device whose upstream bridge is a CXL downstream port
+ * @probe: if true, only check whether the reset is supported
+ *
+ * Issues an SBR on @dev's parent bus. Temporarily sets the CXL Port
+ * DVSEC Unmask SBR bit across the reset. When ACS Source Validation
+ * is enabled on the bridge, also temporarily clears Bus Master Enable
+ * and ACS Source Validation, per CXL r4.0 sec 8.1.5.1.
+ *
+ * Return: 0 on success, -ENOTTY if the reset cannot be issued, or an
+ * errno from the reset path.
+ */
 static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
 {
 	struct pci_dev *bridge;
 	u16 dvsec, reg, val;
+	u16 saved_cmd = 0, saved_acs_ctrl = 0;
 	int rc;
 
 	bridge = pci_upstream_bridge(dev);
@@ -4957,6 +5002,8 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
 		return rc;
 	}
 
+	cxl_disable_acs_sv_bme(bridge, &saved_cmd, &saved_acs_ctrl);
+
 	if (reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR) {
 		val = reg;
 	} else {
@@ -4971,6 +5018,8 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
 		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
 				      reg);
 
+	cxl_restore_acs_sv_bme(bridge, saved_cmd, saved_acs_ctrl);
+
 	pci_dev_reset_iommu_done(dev);
 	return rc;
 }
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] cxl/core: Recover from PM Init failure via cxl_reset_bus_function()
  2026-04-28 18:24 Fabio M. De Francesco
  2026-04-28 18:24 ` [PATCH 1/2] PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled Fabio M. De Francesco
@ 2026-04-28 18:24 ` Fabio M. De Francesco
  2026-05-01 21:59   ` Dave Jiang
  2026-05-01 22:01 ` Dave Jiang
  2 siblings, 1 reply; 6+ messages in thread
From: Fabio M. De Francesco @ 2026-04-28 18:24 UTC (permalink / raw)
  To: linux-cxl
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Bjorn Helgaas,
	linux-kernel, linux-pci, Fabio M. De Francesco

CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which a
Secondary Bus Reset, a Link Down, or Downstream Port Containment on a
CXL Downstream Port prevents Port PM Init from completing when ACS
Source Validation is enabled.

During CXL enumeration, for each CXL Downstream Port in a memdev's
ancestry, check whether PM Init has completed. If it has not, invoke
cxl_reset_bus_function() which is exported for use by CXL.

Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
---
 drivers/cxl/core/pci.c        | 30 ++++++++++++++++++++++++++++++
 drivers/cxl/core/port.c       | 22 ++++++++++++++++++++++
 drivers/cxl/cxlpci.h          |  3 +++
 drivers/pci/pci.c             |  3 ++-
 include/linux/pci.h           |  1 +
 include/uapi/linux/pci_regs.h |  2 ++
 6 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index d1f487b3d809..de6a317df650 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -926,3 +926,33 @@ int cxl_port_get_possible_dports(struct cxl_port *port)
 
 	return ctx.count;
 }
+
+/**
+ * cxl_port_pm_init_is_complete - check the downstream port's PM Init Complete
+ * @pdev: downstream port
+ *
+ * Read the Port Power Management Initialization Complete bit in the
+ * Downstream Port's CXL DVSEC Port Extended Status register.
+ *
+ * Return: false only when the bit is observably clear. Return true when PM
+ * init is complete, when @pdev is not a CXL port (no Port DVSEC), or when
+ * the status register cannot be read.
+ */
+bool cxl_port_pm_init_is_complete(struct pci_dev *pdev)
+{
+	u16 status;
+	u16 dvsec;
+	int rc;
+
+	dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
+					  PCI_DVSEC_CXL_PORT);
+	if (!dvsec)
+		return true;
+
+	rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_PORT_EXT_STATUS,
+				  &status);
+	if (rc || PCI_POSSIBLE_ERROR(status))
+		return true;
+
+	return !!FIELD_GET(PCI_DVSEC_CXL_PORT_EXT_STATUS_PM_INIT_COMP, status);
+}
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index c5aacd7054f1..a91841855d3b 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1825,6 +1825,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
 		if (is_cxl_host_bridge(dport_dev))
 			return 0;
 
+		/*
+		 * Check the downstream port's PM init status, and if it has
+		 * failed retry PM init according to CXL Spec. 4.0 Sect. 8.1.5.1
+		 * - Implementation Note
+		 */
+		if (dev_is_pci(dport_dev) && dev_is_pci(iter->parent)) {
+			struct pci_dev *dport_pdev = to_pci_dev(dport_dev);
+
+			if (!cxl_port_pm_init_is_complete(dport_pdev)) {
+				dev_dbg(&cxlmd->dev,
+					"PM init failed for %s, retrying PM init\n",
+					dev_name(dport_dev));
+
+				cxl_reset_bus_function(to_pci_dev(iter->parent), false);
+
+				if (!cxl_port_pm_init_is_complete(dport_pdev))
+					dev_dbg(&cxlmd->dev,
+						"PM init failed retry for %s\n",
+						dev_name(dport_dev));
+			}
+		}
+
 		uport_dev = dport_dev->parent;
 		if (!uport_dev) {
 			dev_warn(dev, "at %s no parent for dport: %s\n",
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index b826eb53cf7b..c66ff2ce82a3 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -114,4 +114,7 @@ static inline void devm_cxl_port_ras_setup(struct cxl_port *port)
 
 int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
 		       struct cxl_register_map *map);
+
+bool cxl_port_pm_init_is_complete(struct pci_dev *pdev);
+
 #endif /* __CXL_PCI_H__ */
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 047d3b4508a5..ae30da22daf4 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4974,7 +4974,7 @@ static void cxl_restore_acs_sv_bme(struct pci_dev *bridge, u16 saved_cmd,
  * Return: 0 on success, -ENOTTY if the reset cannot be issued, or an
  * errno from the reset path.
  */
-static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
+int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
 {
 	struct pci_dev *bridge;
 	u16 dvsec, reg, val;
@@ -5023,6 +5023,7 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
 	pci_dev_reset_iommu_done(dev);
 	return rc;
 }
+EXPORT_SYMBOL_NS_GPL(cxl_reset_bus_function, "CXL");
 
 void pci_dev_lock(struct pci_dev *dev)
 {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c4454583c11..1fb1360d41e8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1477,6 +1477,7 @@ int pci_probe_reset_slot(struct pci_slot *slot);
 int pci_probe_reset_bus(struct pci_bus *bus);
 int pci_reset_bus(struct pci_dev *dev);
 void pci_reset_secondary_bus(struct pci_dev *dev);
+int cxl_reset_bus_function(struct pci_dev *dev, bool probe);
 void pcibios_reset_secondary_bus(struct pci_dev *dev);
 void pci_update_resource(struct pci_dev *dev, int resno);
 int __must_check pci_assign_resource(struct pci_dev *dev, int i);
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 14f634ab9350..7e2579f89041 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -1369,6 +1369,8 @@
 
 /* CXL r4.0, 8.1.5: Extensions DVSEC for Ports */
 #define PCI_DVSEC_CXL_PORT				3
+#define  PCI_DVSEC_CXL_PORT_EXT_STATUS			0x0A
+#define   PCI_DVSEC_CXL_PORT_EXT_STATUS_PM_INIT_COMP	_BITUL(0)
 #define  PCI_DVSEC_CXL_PORT_CTL				0x0c
 #define   PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR		0x00000001
 
-- 
2.53.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled
  2026-04-28 18:24 ` [PATCH 1/2] PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled Fabio M. De Francesco
@ 2026-05-01 18:36   ` Dave Jiang
  0 siblings, 0 replies; 6+ messages in thread
From: Dave Jiang @ 2026-05-01 18:36 UTC (permalink / raw)
  To: Fabio M. De Francesco, linux-cxl
  Cc: Davidlohr Bueso, Alison Schofield, Vishal Verma, Ira Weiny,
	Bjorn Helgaas, linux-kernel, linux-pci, Jonathan Cameron,
	Dan Williams



On 4/28/26 11:24 AM, Fabio M. De Francesco wrote:
> CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which
> issuing a Secondary Bus Reset on a CXL Downstream Port leaves the
> Port Power Management Initialization Complete bit unset when the PCIe
> Access Control Service (ACS) Source Validation bit (SV) is enabled on
> the Downstream Port. The spec states that another SBR alone will not
> facilitate recovery and shows a software recovery sequence.
> 
> Implement the sequence by extending cxl_reset_bus_function() to save,
> clear, and restore ACS SV and Bus Master Enable (BME) on the Downstream
> Port around the SBR with the use of helpers.
> 
> The wait inside pci_bridge_secondary_bus_reset() covers the 100 ms
> referenced by the spec. The helpers return when ACS SV is not enabled on
> the Downstream Port.
> 
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/pci/pci.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8f7cfcc00090..047d3b4508a5 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4930,10 +4930,55 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
>  	return rc;
>  }
>  
> +static void cxl_disable_acs_sv_bme(struct pci_dev *bridge, u16 *saved_cmd,
> +				   u16 *saved_acs_ctrl)

Maybe you can call this 'cxl_bus_reset_prep' and the restore one 'cxl_bus_reset_done'?

> +{

Should you return int and check if the two ptr passed in are valid?

> +	if (!bridge->acs_cap)
> +		return;

Returning here and the output parameters are not set. That is a problem when you try to restore with invalid values. Maybe -EOPNOTSUPP needs to be returned. Errno from this function should prevent the restore from being called.

> +
> +	pci_read_config_word(bridge, bridge->acs_cap + PCI_ACS_CTRL,
> +			     saved_acs_ctrl);
> +	if (!(*saved_acs_ctrl & PCI_ACS_SV))
> +		return;

What is the caller suppose to do if PCI_ACS_SV is not set? Should you skip restore later?

> +
> +	pci_read_config_word(bridge, PCI_COMMAND, saved_cmd);
> +	if (*saved_cmd & PCI_COMMAND_MASTER)

If all you care is master bit enabled, maybe just write back a bool 'master_en'.

> +		pci_clear_master(bridge);
> +
> +	pci_write_config_word(bridge, bridge->acs_cap + PCI_ACS_CTRL,
> +			      *saved_acs_ctrl & ~PCI_ACS_SV);
> +}
> +
> +static void cxl_restore_acs_sv_bme(struct pci_dev *bridge, u16 saved_cmd,
> +				   u16 saved_acs_ctrl)

static void cxl_bus_reset_done(struct pci_dev *bridge, bool master_en, u16 acs_ctrl)

saved_ has no relevance to the function itself.

Come to think of it, why not create a local struct and pass that in as parameter for prep and done functions? That way in the future if there are other stuff that needs to be done, this can be more versatile.

struct cxl_reset_ctx {
	u16 acs_ctrl;
	bool master_en;
	bool acs_restore;
};

This way you can also set 'acs_restore' on completion of the prep function. And in the done function, you can check and return early if it's not set. That takes away the messiness of returning errno in the prep function and then needing to determine if you need to call the prep function or not.

DJ

> +{
> +	if (!bridge->acs_cap || !(saved_acs_ctrl & PCI_ACS_SV))
> +		return;
> +
> +	pci_write_config_word(bridge, bridge->acs_cap + PCI_ACS_CTRL,
> +			      saved_acs_ctrl);
> +	if (saved_cmd & PCI_COMMAND_MASTER)
> +		pci_set_master(bridge);
> +}
> +
> +/**
> + * cxl_reset_bus_function - SBR for a child of a CXL downstream port
> + * @dev: child device whose upstream bridge is a CXL downstream port
> + * @probe: if true, only check whether the reset is supported
> + *
> + * Issues an SBR on @dev's parent bus. Temporarily sets the CXL Port
> + * DVSEC Unmask SBR bit across the reset. When ACS Source Validation
> + * is enabled on the bridge, also temporarily clears Bus Master Enable
> + * and ACS Source Validation, per CXL r4.0 sec 8.1.5.1.
> + *
> + * Return: 0 on success, -ENOTTY if the reset cannot be issued, or an
> + * errno from the reset path.
> + */
>  static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
>  {
>  	struct pci_dev *bridge;
>  	u16 dvsec, reg, val;
> +	u16 saved_cmd = 0, saved_acs_ctrl = 0;
>  	int rc;
>  
>  	bridge = pci_upstream_bridge(dev);
> @@ -4957,6 +5002,8 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
>  		return rc;
>  	}
>  
> +	cxl_disable_acs_sv_bme(bridge, &saved_cmd, &saved_acs_ctrl);
> +
>  	if (reg & PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR) {
>  		val = reg;
>  	} else {
> @@ -4971,6 +5018,8 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
>  		pci_write_config_word(bridge, dvsec + PCI_DVSEC_CXL_PORT_CTL,
>  				      reg);
>  
> +	cxl_restore_acs_sv_bme(bridge, saved_cmd, saved_acs_ctrl);
> +
>  	pci_dev_reset_iommu_done(dev);
>  	return rc;
>  }


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] cxl/core: Recover from PM Init failure via cxl_reset_bus_function()
  2026-04-28 18:24 ` [PATCH 2/2] cxl/core: Recover from PM Init failure via cxl_reset_bus_function() Fabio M. De Francesco
@ 2026-05-01 21:59   ` Dave Jiang
  0 siblings, 0 replies; 6+ messages in thread
From: Dave Jiang @ 2026-05-01 21:59 UTC (permalink / raw)
  To: Fabio M. De Francesco, linux-cxl
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Bjorn Helgaas, linux-kernel, linux-pci



On 4/28/26 11:24 AM, Fabio M. De Francesco wrote:
> CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which a
> Secondary Bus Reset, a Link Down, or Downstream Port Containment on a
> CXL Downstream Port prevents Port PM Init from completing when ACS
> Source Validation is enabled.
> 
> During CXL enumeration, for each CXL Downstream Port in a memdev's
> ancestry, check whether PM Init has completed. If it has not, invoke
> cxl_reset_bus_function() which is exported for use by CXL.
> 
> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
> ---
>  drivers/cxl/core/pci.c        | 30 ++++++++++++++++++++++++++++++
>  drivers/cxl/core/port.c       | 22 ++++++++++++++++++++++
>  drivers/cxl/cxlpci.h          |  3 +++
>  drivers/pci/pci.c             |  3 ++-
>  include/linux/pci.h           |  1 +
>  include/uapi/linux/pci_regs.h |  2 ++
>  6 files changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index d1f487b3d809..de6a317df650 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -926,3 +926,33 @@ int cxl_port_get_possible_dports(struct cxl_port *port)
>  
>  	return ctx.count;
>  }
> +
> +/**
> + * cxl_port_pm_init_is_complete - check the downstream port's PM Init Complete
> + * @pdev: downstream port
> + *
> + * Read the Port Power Management Initialization Complete bit in the
> + * Downstream Port's CXL DVSEC Port Extended Status register.
> + *
> + * Return: false only when the bit is observably clear. Return true when PM
> + * init is complete, when @pdev is not a CXL port (no Port DVSEC), or when
> + * the status register cannot be read.
> + */
> +bool cxl_port_pm_init_is_complete(struct pci_dev *pdev)
> +{
> +	u16 status;
> +	u16 dvsec;
> +	int rc;
> +
> +	dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
> +					  PCI_DVSEC_CXL_PORT);
> +	if (!dvsec)
> +		return true;
> +
> +	rc = pci_read_config_word(pdev, dvsec + PCI_DVSEC_CXL_PORT_EXT_STATUS,
> +				  &status);
> +	if (rc || PCI_POSSIBLE_ERROR(status))
> +		return true;
> +
> +	return !!FIELD_GET(PCI_DVSEC_CXL_PORT_EXT_STATUS_PM_INIT_COMP, status);
> +}
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index c5aacd7054f1..a91841855d3b 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1825,6 +1825,28 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>  		if (is_cxl_host_bridge(dport_dev))
>  			return 0;
>  
> +		/*
> +		 * Check the downstream port's PM init status, and if it has
> +		 * failed retry PM init according to CXL Spec. 4.0 Sect. 8.1.5.1
> +		 * - Implementation Note
> +		 */
> +		if (dev_is_pci(dport_dev) && dev_is_pci(iter->parent)) {
> +			struct pci_dev *dport_pdev = to_pci_dev(dport_dev);
> +
> +			if (!cxl_port_pm_init_is_complete(dport_pdev)) {
> +				dev_dbg(&cxlmd->dev,
> +					"PM init failed for %s, retrying PM init\n",
> +					dev_name(dport_dev));
> +
> +				cxl_reset_bus_function(to_pci_dev(iter->parent), false);
> +
> +				if (!cxl_port_pm_init_is_complete(dport_pdev))
> +					dev_dbg(&cxlmd->dev,
> +						"PM init failed retry for %s\n",
> +						dev_name(dport_dev));
> +			}
> +		}

Make this a helper function and move it to core/pci.c. Also pass 'struct device' and not 'struct pci_dev' to that function.

I do wonder if you reset it here, maybe you need to restart enumeration from the endpoint instead of just continuing as if nothing happened.

DJ

> +
>  		uport_dev = dport_dev->parent;
>  		if (!uport_dev) {
>  			dev_warn(dev, "at %s no parent for dport: %s\n",
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index b826eb53cf7b..c66ff2ce82a3 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -114,4 +114,7 @@ static inline void devm_cxl_port_ras_setup(struct cxl_port *port)
>  
>  int cxl_pci_setup_regs(struct pci_dev *pdev, enum cxl_regloc_type type,
>  		       struct cxl_register_map *map);
> +
> +bool cxl_port_pm_init_is_complete(struct pci_dev *pdev);
> +
>  #endif /* __CXL_PCI_H__ */
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 047d3b4508a5..ae30da22daf4 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4974,7 +4974,7 @@ static void cxl_restore_acs_sv_bme(struct pci_dev *bridge, u16 saved_cmd,
>   * Return: 0 on success, -ENOTTY if the reset cannot be issued, or an
>   * errno from the reset path.
>   */
> -static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
> +int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
>  {
>  	struct pci_dev *bridge;
>  	u16 dvsec, reg, val;
> @@ -5023,6 +5023,7 @@ static int cxl_reset_bus_function(struct pci_dev *dev, bool probe)
>  	pci_dev_reset_iommu_done(dev);
>  	return rc;
>  }
> +EXPORT_SYMBOL_NS_GPL(cxl_reset_bus_function, "CXL");
>  
>  void pci_dev_lock(struct pci_dev *dev)
>  {
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 2c4454583c11..1fb1360d41e8 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1477,6 +1477,7 @@ int pci_probe_reset_slot(struct pci_slot *slot);
>  int pci_probe_reset_bus(struct pci_bus *bus);
>  int pci_reset_bus(struct pci_dev *dev);
>  void pci_reset_secondary_bus(struct pci_dev *dev);
> +int cxl_reset_bus_function(struct pci_dev *dev, bool probe);
>  void pcibios_reset_secondary_bus(struct pci_dev *dev);
>  void pci_update_resource(struct pci_dev *dev, int resno);
>  int __must_check pci_assign_resource(struct pci_dev *dev, int i);
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index 14f634ab9350..7e2579f89041 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -1369,6 +1369,8 @@
>  
>  /* CXL r4.0, 8.1.5: Extensions DVSEC for Ports */
>  #define PCI_DVSEC_CXL_PORT				3
> +#define  PCI_DVSEC_CXL_PORT_EXT_STATUS			0x0A
> +#define   PCI_DVSEC_CXL_PORT_EXT_STATUS_PM_INIT_COMP	_BITUL(0)
>  #define  PCI_DVSEC_CXL_PORT_CTL				0x0c
>  #define   PCI_DVSEC_CXL_PORT_CTL_UNMASK_SBR		0x00000001
>  


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re:
  2026-04-28 18:24 Fabio M. De Francesco
  2026-04-28 18:24 ` [PATCH 1/2] PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled Fabio M. De Francesco
  2026-04-28 18:24 ` [PATCH 2/2] cxl/core: Recover from PM Init failure via cxl_reset_bus_function() Fabio M. De Francesco
@ 2026-05-01 22:01 ` Dave Jiang
  2 siblings, 0 replies; 6+ messages in thread
From: Dave Jiang @ 2026-05-01 22:01 UTC (permalink / raw)
  To: Fabio M. De Francesco, linux-cxl
  Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ira Weiny, Dan Williams, Bjorn Helgaas, linux-kernel, linux-pci



On 4/28/26 11:24 AM, Fabio M. De Francesco wrote:
> Subject: [PATCH 0/2] PCI/CXL: Recover CXL Downstream Ports from PM Init failure
> 
> CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which a
> Secondary Bus Reset, a Link Down, or Downstream Port Containment on a

I'm not sure if this series covers a Link Down event (i.e. hotplug). As I recall, cxl_reset_bus_function() only happens via sysfs trigger.

DJ


> CXL Downstream Port prevents Port PM Init from completing when ACS
> Source Validation is enabled on the Downstream Port. The spec states
> that another SBR alone does not recover the port and describes a
> software recovery sequence.  
> 
> Patch 1 extends cxl_reset_bus_function(), the helper backing the cxl_bus
> PCI/CXL reset method exposed to userspace via sysfs. It saves, clears,
> and restores ACS Source Validation and Bus Master Enable on the CXL
> Downstream Port around the SBR it issues. This keeps the userspace
> cxl_bus reset path from leaving the port unable to complete PM Init.
> 
> Patch 2 adds a recovery pass during CXL enumeration. For each CXL
> Downstream Port in a memdev's ancestry, the CXL core checks whether PM
> Init has completed. If it has not, regardless of what caused the
> failure, it invokes cxl_reset_bus_function() on the child below the port
> in the hope of restoring the port to a usable state. CXL enumeration
> re-runs after events that tear down and re-probe the memdev, including
> DPC, AER, and Link Down, so those paths reach this recovery.
> 
> This small series is developed from an old RFC v3:
> https://lore.kernel.org/linux-cxl/20260330193347.25072-1-fabio.m.de.francesco@linux.intel.com/
> 
> Fabio M. De Francesco (2):
>   PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled
>   cxl/core: Recover from PM Init failure via cxl_reset_bus_function()
> 
> drivers/cxl/core/pci.c        | 30 ++++++++++++++++++++
>  drivers/cxl/core/port.c       | 22 +++++++++++++++
>  drivers/cxl/cxlpci.h          |  3 ++
>  drivers/pci/pci.c             | 52 ++++++++++++++++++++++++++++++++++-
>  include/linux/pci.h           |  1 + 
>  include/uapi/linux/pci_regs.h |  2 ++
>  6 files changed, 109 insertions(+), 1 deletion(-)
> 


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2026-05-01 22:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-28 18:24 Fabio M. De Francesco
2026-04-28 18:24 ` [PATCH 1/2] PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled Fabio M. De Francesco
2026-05-01 18:36   ` Dave Jiang
2026-04-28 18:24 ` [PATCH 2/2] cxl/core: Recover from PM Init failure via cxl_reset_bus_function() Fabio M. De Francesco
2026-05-01 21:59   ` Dave Jiang
2026-05-01 22:01 ` Dave Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox