Linux PCI subsystem development
 help / color / mirror / Atom feed
* (no subject)
@ 2026-04-28 18:24 Fabio M. De Francesco
  2026-04-28 18:24 ` [PATCH 1/2] PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled Fabio M. De Francesco
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Fabio M. De Francesco @ 2026-04-28 18:24 UTC (permalink / raw)
  To: linux-cxl
  Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ira Weiny, Dan Williams, Bjorn Helgaas,
	linux-kernel, linux-pci, Fabio M. De Francesco

Subject: [PATCH 0/2] PCI/CXL: Recover CXL Downstream Ports from PM Init failure

CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which a
Secondary Bus Reset, a Link Down, or Downstream Port Containment on a
CXL Downstream Port prevents Port PM Init from completing when ACS
Source Validation is enabled on the Downstream Port. The spec states
that another SBR alone does not recover the port and describes a
software recovery sequence.  

Patch 1 extends cxl_reset_bus_function(), the helper backing the cxl_bus
PCI/CXL reset method exposed to userspace via sysfs. It saves, clears,
and restores ACS Source Validation and Bus Master Enable on the CXL
Downstream Port around the SBR it issues. This keeps the userspace
cxl_bus reset path from leaving the port unable to complete PM Init.

Patch 2 adds a recovery pass during CXL enumeration. For each CXL
Downstream Port in a memdev's ancestry, the CXL core checks whether PM
Init has completed. If it has not, regardless of what caused the
failure, it invokes cxl_reset_bus_function() on the child below the port
in the hope of restoring the port to a usable state. CXL enumeration
re-runs after events that tear down and re-probe the memdev, including
DPC, AER, and Link Down, so those paths reach this recovery.

This small series is developed from an old RFC v3:
https://lore.kernel.org/linux-cxl/20260330193347.25072-1-fabio.m.de.francesco@linux.intel.com/

Fabio M. De Francesco (2):
  PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled
  cxl/core: Recover from PM Init failure via cxl_reset_bus_function()

drivers/cxl/core/pci.c        | 30 ++++++++++++++++++++
 drivers/cxl/core/port.c       | 22 +++++++++++++++
 drivers/cxl/cxlpci.h          |  3 ++
 drivers/pci/pci.c             | 52 ++++++++++++++++++++++++++++++++++-
 include/linux/pci.h           |  1 + 
 include/uapi/linux/pci_regs.h |  2 ++
 6 files changed, 109 insertions(+), 1 deletion(-)

-- 
2.53.0


^ permalink raw reply	[flat|nested] 13+ messages in thread
* (no subject)
@ 2025-05-09 17:38 Shawn Anastasio
  2025-05-10 19:50 ` Trilok Soni
  0 siblings, 1 reply; 13+ messages in thread
From: Shawn Anastasio @ 2025-05-09 17:38 UTC (permalink / raw)
  To: linux-pci, Lukas Wunner, Krishna Chaitanya Chundru
  Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, chaitanya chundru, Bjorn Andersson, Konrad Dybcio,
	cros-qcom-dts-watchers, Jingoo Han, Bartosz Golaszewski,
	quic_vbadigan, amitk, devicetree, linux-kernel, linux-arm-msm,
	jorge.ramirez, Dmitry Baryshkov, Timothy Pearson, Shawn Anastasio

From: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>

Date: Sat, 12 Apr 2025 07:19:56 +0530
Subject: [PATCH v6] PCI: PCI: Add pcie_link_is_active() to determine if the
 PCIe link is active

Introduce a common API to check if the PCIe link is active, replacing
duplicate code in multiple locations.

Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
---
This is an updated patch pulled from Krishna's v5 series:
https://patchwork.kernel.org/project/linux-pci/list/?series=952665

The following changes to Krishna's v5 were made by me:
  - Revert pcie_link_is_active return type back to int per Lukas' review
    comments
  - Handle non-zero error returns at call site of the new function in
    pci.c/pci_bridge_wait_for_secondary_bus

 drivers/pci/hotplug/pciehp.h      |  1 -
 drivers/pci/hotplug/pciehp_ctrl.c |  2 +-
 drivers/pci/hotplug/pciehp_hpc.c  | 33 +++----------------------------
 drivers/pci/pci.c                 | 26 +++++++++++++++++++++---
 include/linux/pci.h               |  4 ++++
 5 files changed, 31 insertions(+), 35 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 273dd8c66f4e..acef728530e3 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -186,7 +186,6 @@ int pciehp_query_power_fault(struct controller *ctrl);
 int pciehp_card_present(struct controller *ctrl);
 int pciehp_card_present_or_link_active(struct controller *ctrl);
 int pciehp_check_link_status(struct controller *ctrl);
-int pciehp_check_link_active(struct controller *ctrl);
 void pciehp_release_ctrl(struct controller *ctrl);

 int pciehp_sysfs_enable_slot(struct hotplug_slot *hotplug_slot);
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index d603a7aa7483..4bb58ba1c766 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -260,7 +260,7 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 	/* Turn the slot on if it's occupied or link is up */
 	mutex_lock(&ctrl->state_lock);
 	present = pciehp_card_present(ctrl);
-	link_active = pciehp_check_link_active(ctrl);
+	link_active = pcie_link_is_active(ctrl->pcie->port);
 	if (present <= 0 && link_active <= 0) {
 		if (ctrl->state == BLINKINGON_STATE) {
 			ctrl->state = OFF_STATE;
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 8a09fb6083e2..278bc21d531d 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -221,33 +221,6 @@ static void pcie_write_cmd_nowait(struct controller *ctrl, u16 cmd, u16 mask)
 	pcie_do_write_cmd(ctrl, cmd, mask, false);
 }

-/**
- * pciehp_check_link_active() - Is the link active
- * @ctrl: PCIe hotplug controller
- *
- * Check whether the downstream link is currently active. Note it is
- * possible that the card is removed immediately after this so the
- * caller may need to take it into account.
- *
- * If the hotplug controller itself is not available anymore returns
- * %-ENODEV.
- */
-int pciehp_check_link_active(struct controller *ctrl)
-{
-	struct pci_dev *pdev = ctrl_dev(ctrl);
-	u16 lnk_status;
-	int ret;
-
-	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
-	if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
-		return -ENODEV;
-
-	ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
-	ctrl_dbg(ctrl, "%s: lnk_status = %x\n", __func__, lnk_status);
-
-	return ret;
-}
-
 static bool pci_bus_check_dev(struct pci_bus *bus, int devfn)
 {
 	u32 l;
@@ -467,7 +440,7 @@ int pciehp_card_present_or_link_active(struct controller *ctrl)
 	if (ret)
 		return ret;

-	return pciehp_check_link_active(ctrl);
+	return pcie_link_is_active(ctrl_dev(ctrl));
 }

 int pciehp_query_power_fault(struct controller *ctrl)
@@ -584,7 +557,7 @@ static void pciehp_ignore_dpc_link_change(struct controller *ctrl,
 	 * Synthesize it to ensure that it is acted on.
 	 */
 	down_read_nested(&ctrl->reset_lock, ctrl->depth);
-	if (!pciehp_check_link_active(ctrl))
+	if (!pcie_link_is_active(ctrl_dev(ctrl)))
 		pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
 	up_read(&ctrl->reset_lock);
 }
@@ -884,7 +857,7 @@ int pciehp_slot_reset(struct pcie_device *dev)
 	pcie_capability_write_word(dev->port, PCI_EXP_SLTSTA,
 				   PCI_EXP_SLTSTA_DLLSC);

-	if (!pciehp_check_link_active(ctrl))
+	if (!pcie_link_is_active(ctrl_dev(ctrl)))
 		pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);

 	return 0;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e77d5b53c0ce..3bb8354b14bf 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4926,7 +4926,6 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
 		return 0;

 	if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
-		u16 status;

 		pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
 		msleep(delay);
@@ -4942,8 +4941,7 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
 		if (!dev->link_active_reporting)
 			return -ENOTTY;

-		pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
-		if (!(status & PCI_EXP_LNKSTA_DLLLA))
+		if (pcie_link_is_active(dev) <= 0)
 			return -ENOTTY;

 		return pci_dev_wait(child, reset_type,
@@ -6247,6 +6245,28 @@ void pcie_print_link_status(struct pci_dev *dev)
 }
 EXPORT_SYMBOL(pcie_print_link_status);

+/**
+ * pcie_link_is_active() - Checks if the link is active or not
+ * @pdev: PCI device to query
+ *
+ * Check whether the link is active or not.
+ *
+ * Return: link state, or -ENODEV if the config read failes.
+ */
+int pcie_link_is_active(struct pci_dev *pdev)
+{
+	u16 lnk_status;
+	int ret;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
+	if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
+		return -ENODEV;
+
+	pci_dbg(pdev, "lnk_status = %x\n", lnk_status);
+	return !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
+}
+EXPORT_SYMBOL(pcie_link_is_active);
+
 /**
  * pci_select_bars - Make BAR mask from the type of resource
  * @dev: the PCI device for which BAR mask is made
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 51e2bd6405cd..a79a9919320c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1945,6 +1945,7 @@ pci_release_mem_regions(struct pci_dev *pdev)
 			    pci_select_bars(pdev, IORESOURCE_MEM));
 }

+int pcie_link_is_active(struct pci_dev *dev);
 #else /* CONFIG_PCI is not enabled */

 static inline void pci_set_flags(int flags) { }
@@ -2093,6 +2094,9 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
 {
 	return -ENOSPC;
 }
+
+static inline bool pcie_link_is_active(struct pci_dev *dev)
+{ return false; }
 #endif /* CONFIG_PCI */

 /* Include architecture-dependent settings and functions */
--
2.30.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread
* Re: [PATCH v3 7/8] ACPI, PCI: add hostbridge removal function
@ 2012-10-30  4:02 Bjorn Helgaas
  2012-10-30 17:42 ` Yinghai Lu
  0 siblings, 1 reply; 13+ messages in thread
From: Bjorn Helgaas @ 2012-10-30  4:02 UTC (permalink / raw)
  To: Taku Izumi; +Cc: linux-pci, linux-acpi, kaneshige.kenji, yinghai, jiang.liu

I think I'm missing patches 7/8 and 8/8 from this series.  Can
you repost them to make sure I have the latest versions?

Note the comments below:

On Fri, Sep 28, 2012 at 06:46:27PM +0900, Taku Izumi wrote:
> 
> Currently there's no PCI-related clean-up code
> in acpi_pci_root_remove() function.
> This patch introduces function for hostbridge removal,
> and brings back pci_stop_bus_devices() function.
> 
> diff: rebased against current next
>       updated according to Bjorn's comment
> 
> Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
> ---
>  drivers/acpi/pci_bind.c     |    7 +++++++
>  drivers/acpi/pci_root.c     |    6 ++++++
>  drivers/pci/remove.c        |   28 ++++++++++++++++++++++++++++
>  include/acpi/acpi_drivers.h |    1 +
>  include/linux/pci.h         |    2 ++
>  5 files changed, 44 insertions(+)
> 
> Index: Bjorn-next-0925-configchange/drivers/pci/remove.c
> ===================================================================
> --- Bjorn-next-0925-configchange.orig/drivers/pci/remove.c
> +++ Bjorn-next-0925-configchange/drivers/pci/remove.c
> @@ -111,3 +111,31 @@ void pci_stop_and_remove_bus_device(stru
>  	pci_remove_bus_device(dev);
>  }
>  EXPORT_SYMBOL(pci_stop_and_remove_bus_device);
> +
> +void pci_stop_bus_devices(struct pci_bus *bus)
> +{
> +	struct pci_dev *dev, *tmp;
> +
> +	list_for_each_entry_safe_reverse(dev, tmp,
> +					 &bus->devices, bus_list) {
> +		pci_stop_bus_device(dev);
> +	}
> +
> +}
> +EXPORT_SYMBOL(pci_stop_bus_devices);

I'm hesitant to introduce pci_stop_bus_devices() again, particularly
when it is exported.  The stop/remove split introduces the state where
devices are "stopped" but haven't been "removed" yet.

In this state, the driver .remove() method has been called, sysfs has
been cleaned up, and the struct device has been unregistered, but the 
struct pci_dev itself still exists.  Obviously, this state *must*
exist internally in the PCI core as we remove the PCI device.

The problem is that we have non-core code that *depends* on being
run while in this transitory state.  I think this is a design mistake.

The code that depends on this state is basically just the stuff in the
acpi_pci_drivers list, namely, acpi_pci_hp_driver and acpi_pci_slot_driver.
I suspect that the main reason we have the acpi_pci_drivers list and the
whole acpi_pci_register_driver() infrastructure is so that these PCI
host bridge "sub-drivers" can be built as modules.

I don't think there's really any value in having these sub-drivers as
modules, and it leads to a lot of complication in the code.  I'm pretty
sure that forcing them to be selected at build-time will let us make
things much simpler.

If we have to have pci_stop_bus_devices() as an interim measure, I can
live with it, but it doesn't need to be exported, does it?

> +void pci_remove_host_bridge(struct pci_host_bridge *bridge)
> +{
> +	struct pci_bus *root = bridge->bus;
> +	struct pci_dev *dev, *tmp;
> +
> +	list_for_each_entry_safe_reverse(dev, tmp,
> +					 &root->devices, bus_list) {
> +		pci_remove_bus_device(dev);
> +	}
> +
> +	pci_remove_bus(root);
> +
> +	device_unregister(&bridge->dev);
> +}
> +EXPORT_SYMBOL(pci_remove_host_bridge);
> Index: Bjorn-next-0925-configchange/drivers/acpi/pci_root.c
> ===================================================================
> --- Bjorn-next-0925-configchange.orig/drivers/acpi/pci_root.c
> +++ Bjorn-next-0925-configchange/drivers/acpi/pci_root.c
> @@ -659,8 +659,10 @@ static int acpi_pci_root_remove(struct a
>  {
>  	struct acpi_pci_root *root = acpi_driver_data(device);
>  	struct acpi_pci_driver *driver;
> +	struct pci_host_bridge *bridge = to_pci_host_bridge(root->bus->bridge);
>  
>  	mutex_lock(&acpi_pci_root_lock);
> +	pci_stop_bus_devices(root->bus);
>  	list_for_each_entry(driver, &acpi_pci_drivers, node)
>  		if (driver->remove)
>  			driver->remove(root);
> @@ -668,6 +670,10 @@ static int acpi_pci_root_remove(struct a
>  	device_set_run_wake(root->bus->bridge, false);
>  	pci_acpi_remove_bus_pm_notifier(device);
>  
> +	acpi_pci_unbind_root(device);
> +
> +	pci_remove_host_bridge(bridge);
> +
>  	list_del(&root->node);
>  	mutex_unlock(&acpi_pci_root_lock);
>  	kfree(root);
> Index: Bjorn-next-0925-configchange/include/linux/pci.h
> ===================================================================
> --- Bjorn-next-0925-configchange.orig/include/linux/pci.h
> +++ Bjorn-next-0925-configchange/include/linux/pci.h
> @@ -734,6 +734,8 @@ extern struct pci_dev *pci_dev_get(struc
>  extern void pci_dev_put(struct pci_dev *dev);
>  extern void pci_remove_bus(struct pci_bus *b);
>  extern void pci_stop_and_remove_bus_device(struct pci_dev *dev);
> +extern void pci_stop_bus_devices(struct pci_bus *bus);
> +extern void pci_remove_host_bridge(struct pci_host_bridge *bridge);
>  void pci_setup_cardbus(struct pci_bus *bus);
>  extern void pci_sort_breadthfirst(void);
>  #define dev_is_pci(d) ((d)->bus == &pci_bus_type)
> Index: Bjorn-next-0925-configchange/drivers/acpi/pci_bind.c
> ===================================================================
> --- Bjorn-next-0925-configchange.orig/drivers/acpi/pci_bind.c
> +++ Bjorn-next-0925-configchange/drivers/acpi/pci_bind.c
> @@ -118,3 +118,10 @@ int acpi_pci_bind_root(struct acpi_devic
>  
>  	return 0;
>  }
> +
> +void  acpi_pci_unbind_root(struct acpi_device *device)
> +{
> +	device->ops.bind = NULL;
> +	device->ops.unbind = NULL;
> +}
> +
> Index: Bjorn-next-0925-configchange/include/acpi/acpi_drivers.h
> ===================================================================
> --- Bjorn-next-0925-configchange.orig/include/acpi/acpi_drivers.h
> +++ Bjorn-next-0925-configchange/include/acpi/acpi_drivers.h
> @@ -101,6 +101,7 @@ struct pci_bus;
>  
>  struct pci_dev *acpi_get_pci_dev(acpi_handle);
>  int acpi_pci_bind_root(struct acpi_device *device);
> +void acpi_pci_unbind_root(struct acpi_device *device);
>  
>  /* Arch-defined function to add a bus to the system */
>  
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re:
@ 2011-12-13  2:58 Matt Shaw
  0 siblings, 0 replies; 13+ messages in thread
From: Matt Shaw @ 2011-12-13  2:58 UTC (permalink / raw)
  To: doshoes1990

https://docs.google.com/document/d/1-MgXERW0_TNnd0VK2caXYhDXtT58z1DJ0laAmJaj=
rEs/edit

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2026-05-06  5:54 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-28 18:24 Fabio M. De Francesco
2026-04-28 18:24 ` [PATCH 1/2] PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled Fabio M. De Francesco
2026-05-01 18:36   ` Dave Jiang
2026-04-28 18:24 ` [PATCH 2/2] cxl/core: Recover from PM Init failure via cxl_reset_bus_function() Fabio M. De Francesco
2026-05-01 21:59   ` Dave Jiang
2026-05-06  5:54   ` Alison Schofield
2026-05-01 22:01 ` Dave Jiang
  -- strict thread matches above, loose matches on Subject: below --
2025-05-09 17:38 Shawn Anastasio
2025-05-10 19:50 ` Trilok Soni
2012-10-30  4:02 [PATCH v3 7/8] ACPI, PCI: add hostbridge removal function Bjorn Helgaas
2012-10-30 17:42 ` Yinghai Lu
2012-11-02  0:17   ` Rafael J. Wysocki
2012-11-05 22:27     ` Re: Bjorn Helgaas
2012-11-05 22:49       ` Re: Yinghai Lu
2012-11-06  5:03   ` Taku Izumi
2011-12-13  2:58 Matt Shaw

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox