* [PATCH v2 0/4] PCI: Add support for resetting the slots in a platform specific way
@ 2025-04-16 16:29 Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 1/4] PCI/ERR: Remove misleading TODO regarding kernel panic Manivannan Sadhasivam via B4 Relay
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-04-16 16:29 UTC (permalink / raw)
To: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring
Cc: dingwei, cassel, Lukas Wunner, Krishna Chaitanya Chundru,
linuxppc-dev, linux-pci, linux-kernel, linux-arm-msm,
Manivannan Sadhasivam
Hi,
Currently, in the event of AER/DPC, PCI core will try to reset the slot and its
subordinate devices by invoking bridge control reset and FLR. But in some
cases like AER Fatal error, it might be necessary to reset the slots using the
PCI host bridge drivers in a platform specific way (as indicated by the TODO in
the pcie_do_recovery() function in drivers/pci/pcie/err.c). Otherwise, the PCI
link won't be recovered successfully.
So this series adds a new callback 'pci_host_bridge::reset_slot' for the host
bridge drivers to reset the slot when a fatal error happens.
Also, this series allows the host bridge drivers to handle PCI link down event
by resetting the slots and recovering the bus. This is accomplished by the
help of a new API 'pci_host_handle_link_down()'. Host bridge drivers are
expected to call this API (preferrably from a threaded IRQ handler) when a link
down event is detected. The API will reuse the pcie_do_recovery() function to
recover the link if AER support is enabled, otherwise it will directly call the
reset_slot() callback of the host bridge driver (if exists).
For reference, I've modified the pcie-qcom driver to call
pci_host_handle_link_down() after receiving LINK_DOWN global_irq event and
populated the 'pci_host_bridge::reset_slot()' callback to reset the controller
(there by slots). Since the Qcom PCIe controllers support only a single root
port (slot) per controller instance, reset_slot() callback is going to be
invoked only once. For multi root port controllers, this callback is supposed to
identify the slots using the supplied 'pci_dev' pointer and reset them.
NOTE
====
This series is a reworked version of the earlier series [1] that I submitted for
handling PCI link down event. In this series, I've made use of the AER helpers
to recover the link as it allows notifying the device drivers and also
allows saving/restoring the config space.
Testing
=======
This series is tested on Qcom RB5 and SA8775p Ride boards by triggering the link
down event manually by writing to LTSSM register. For the error recovery to
succeed (if AER is enabled), all the drivers in the bridge hierarchy should have
the 'err_handlers' populated. Otherwise, the link recovery will fail.
[1] https://lore.kernel.org/linux-pci/20250221172309.120009-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
Changes in v2:
- Moved calling reset_slot() callback from pcie_do_recovery() to pcibios_reset_secondary_bus()
- Link to v1: https://lore.kernel.org/r/20250404-pcie-reset-slot-v1-0-98952918bf90@linaro.org
---
Manivannan Sadhasivam (4):
PCI/ERR: Remove misleading TODO regarding kernel panic
PCI/ERR: Add support for resetting the slots in a platform specific way
PCI: Add link down handling for host bridges
PCI: qcom: Add support for resetting the slot due to link down event
drivers/pci/controller/dwc/pcie-qcom.c | 89 +++++++++++++++++++++++++++++++++-
drivers/pci/pci.c | 12 +++++
drivers/pci/pci.h | 21 ++++++++
drivers/pci/pcie/err.c | 33 ++++++++++---
drivers/pci/probe.c | 7 +++
include/linux/pci.h | 2 +
6 files changed, 156 insertions(+), 8 deletions(-)
---
base-commit: 08733088b566b58283f0f12fb73f5db6a9a9de30
change-id: 20250404-pcie-reset-slot-730bfa71a202
Best regards,
--
Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 1/4] PCI/ERR: Remove misleading TODO regarding kernel panic
2025-04-16 16:29 [PATCH v2 0/4] PCI: Add support for resetting the slots in a platform specific way Manivannan Sadhasivam via B4 Relay
@ 2025-04-16 16:29 ` Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 2/4] PCI/ERR: Add support for resetting the slots in a platform specific way Manivannan Sadhasivam via B4 Relay
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-04-16 16:29 UTC (permalink / raw)
To: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring
Cc: dingwei, cassel, Lukas Wunner, Krishna Chaitanya Chundru,
linuxppc-dev, linux-pci, linux-kernel, linux-arm-msm,
Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
A PCI device is just another peripheral in a system. So failure to
recover it, must not result in a kernel panic. So remove the TODO which
is quite misleading.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
drivers/pci/pcie/err.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 31090770fffcc94e15ba6e89f649c6f84bfdf0d5..de6381c690f5c21f00021cdc7bde8d93a5c7db52 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -271,7 +271,6 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
pci_uevent_ers(bridge, PCI_ERS_RESULT_DISCONNECT);
- /* TODO: Should kernel panic here? */
pci_info(bridge, "device recovery failed\n");
return status;
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 2/4] PCI/ERR: Add support for resetting the slots in a platform specific way
2025-04-16 16:29 [PATCH v2 0/4] PCI: Add support for resetting the slots in a platform specific way Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 1/4] PCI/ERR: Remove misleading TODO regarding kernel panic Manivannan Sadhasivam via B4 Relay
@ 2025-04-16 16:29 ` Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 3/4] PCI: Add link down handling for host bridges Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 4/4] PCI: qcom: Add support for resetting the slot due to link down event Manivannan Sadhasivam via B4 Relay
3 siblings, 0 replies; 11+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-04-16 16:29 UTC (permalink / raw)
To: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring
Cc: dingwei, cassel, Lukas Wunner, Krishna Chaitanya Chundru,
linuxppc-dev, linux-pci, linux-kernel, linux-arm-msm,
Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Some host bridge devices require resetting the slots in a platform specific
way to recover them from error conditions such as Fatal AER errors, Link
Down etc... So introduce pci_host_bridge::reset_slot callback and call it
from pcibios_reset_secondary_bus() if available.
The 'reset_slot' callback is responsible for resetting the given slot
referenced by the 'pci_dev' pointer in a platform specific way and bring it
back to the working state if possible. If any error occurs during the slot
reset operation, relevant errno should be returned.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
drivers/pci/pci.c | 12 ++++++++++++
drivers/pci/pcie/err.c | 5 -----
include/linux/pci.h | 1 +
3 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 4d7c9f64ea24ec754a135a2585c99489cfa641a9..13709bb898a967968540826a2b7ee8ade6b7e082 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4982,7 +4982,19 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
void __weak pcibios_reset_secondary_bus(struct pci_dev *dev)
{
+ struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
+ int ret;
+
+ if (host->reset_slot) {
+ ret = host->reset_slot(host, dev);
+ if (ret)
+ pci_err(dev, "failed to reset slot: %d\n", ret);
+
+ return;
+ }
+
pci_reset_secondary_bus(dev);
+
}
/**
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index de6381c690f5c21f00021cdc7bde8d93a5c7db52..b834fc0d705938540d3d7d3d8739770c09fe7cf1 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -234,11 +234,6 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
}
if (status == PCI_ERS_RESULT_NEED_RESET) {
- /*
- * TODO: Should call platform-specific
- * functions to reset slot before calling
- * drivers' slot_reset callbacks?
- */
status = PCI_ERS_RESULT_RECOVERED;
pci_dbg(bridge, "broadcast slot_reset message\n");
pci_walk_bridge(bridge, report_slot_reset, &status);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 0e8e3fd77e96713054388bdc82f439e51023c1bf..8d7d2a49b76cf64b4218b179cec495e0d69ddf6f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -599,6 +599,7 @@ struct pci_host_bridge {
void (*release_fn)(struct pci_host_bridge *);
int (*enable_device)(struct pci_host_bridge *bridge, struct pci_dev *dev);
void (*disable_device)(struct pci_host_bridge *bridge, struct pci_dev *dev);
+ int (*reset_slot)(struct pci_host_bridge *bridge, struct pci_dev *dev);
void *release_data;
unsigned int ignore_reset_delay:1; /* For entire hierarchy */
unsigned int no_ext_tags:1; /* No Extended Tags */
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 3/4] PCI: Add link down handling for host bridges
2025-04-16 16:29 [PATCH v2 0/4] PCI: Add support for resetting the slots in a platform specific way Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 1/4] PCI/ERR: Remove misleading TODO regarding kernel panic Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 2/4] PCI/ERR: Add support for resetting the slots in a platform specific way Manivannan Sadhasivam via B4 Relay
@ 2025-04-16 16:29 ` Manivannan Sadhasivam via B4 Relay
2025-04-16 16:57 ` Lukas Wunner
2025-04-16 17:51 ` Krishna Chaitanya Chundru
2025-04-16 16:29 ` [PATCH v2 4/4] PCI: qcom: Add support for resetting the slot due to link down event Manivannan Sadhasivam via B4 Relay
3 siblings, 2 replies; 11+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-04-16 16:29 UTC (permalink / raw)
To: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring
Cc: dingwei, cassel, Lukas Wunner, Krishna Chaitanya Chundru,
linuxppc-dev, linux-pci, linux-kernel, linux-arm-msm,
Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
The PCI link, when down, needs to be recovered to bring it back. But that
cannot be done in a generic way as link recovery procedure is specific to
host bridges. So add a new API pci_host_handle_link_down() that could be
called by the host bridge drivers when the link goes down.
The API will iterate through all the slots and calls the pcie_do_recovery()
function with 'pci_channel_io_frozen' as the state. This will result in the
execution of the AER Fatal error handling code. Since the link down
recovery is pretty much the same as AER Fatal error handling,
pcie_do_recovery() helper is reused here. First the AER error_detected
callback will be triggered for the bridge and the downstream devices. Then,
pcie_do_slot_reset() will be called for each slots, which will reset the
slots using 'reset_slot' callback to recover the link. Once that's done,
resume message will be broadcasted to the bridge and the downstream devices
indicating successful link recovery.
In case if the AER support is not enabled in the kernel, only
pci_bus_error_reset() will be called for each slots as there is no way we
could inform the drivers about link recovery.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
drivers/pci/pci.h | 21 +++++++++++++++++++++
drivers/pci/pcie/err.c | 27 +++++++++++++++++++++++++++
drivers/pci/probe.c | 7 +++++++
include/linux/pci.h | 1 +
4 files changed, 56 insertions(+)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index b81e99cd4b62a3022c8b07a09f212f6888674487..6c1d4c5a82d68e5842636ff296a8d3a06487cb11 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -966,6 +966,7 @@ int pci_aer_clear_status(struct pci_dev *dev);
int pci_aer_raw_clear_status(struct pci_dev *dev);
void pci_save_aer_state(struct pci_dev *dev);
void pci_restore_aer_state(struct pci_dev *dev);
+void pcie_do_recover_slots(struct pci_host_bridge *host);
#else
static inline void pci_no_aer(void) { }
static inline void pci_aer_init(struct pci_dev *d) { }
@@ -975,6 +976,26 @@ static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
static inline void pci_save_aer_state(struct pci_dev *dev) { }
static inline void pci_restore_aer_state(struct pci_dev *dev) { }
+static inline void pcie_do_recover_slots(struct pci_host_bridge *host)
+{
+ struct pci_bus *bus = host->bus;
+ struct pci_dev *dev;
+ int ret;
+
+ if (!host->reset_slot)
+ dev_warn(&host->dev, "Missing reset_slot() callback\n");
+
+ for_each_pci_bridge(dev, bus) {
+ if (!pci_is_root_bus(bus))
+ continue;
+
+ ret = pci_bus_error_reset(dev);
+ if (ret)
+ pci_err(dev, "Failed to reset slot: %d\n", ret);
+ else
+ pci_info(dev, "Slot has been reset\n");
+ }
+}
#endif
#ifdef CONFIG_ACPI
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index b834fc0d705938540d3d7d3d8739770c09fe7cf1..70d8cd37255c5638fddf38e13ea87cb8ebe8553f 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -270,3 +270,30 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
return status;
}
+
+static pci_ers_result_t pcie_do_slot_reset(struct pci_dev *dev)
+{
+ int ret;
+
+ ret = pci_bus_error_reset(dev);
+ if (ret) {
+ pci_err(dev, "Failed to reset slot: %d\n", ret);
+ return PCI_ERS_RESULT_DISCONNECT;
+ }
+
+ pci_info(dev, "Slot has been reset\n");
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+
+void pcie_do_recover_slots(struct pci_host_bridge *host)
+{
+ struct pci_bus *bus = host->bus;
+ struct pci_dev *dev;
+
+ for_each_pci_bridge(dev, bus) {
+ if (pci_is_root_bus(bus))
+ pcie_do_recovery(dev, pci_channel_io_frozen,
+ pcie_do_slot_reset);
+ }
+}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 364fa2a514f8a68fb18bded3259c6847d3932f8b..60ad20eea0259797e68afa7979bb1fc24b6f213b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -3249,6 +3249,13 @@ int pci_host_probe(struct pci_host_bridge *bridge)
}
EXPORT_SYMBOL_GPL(pci_host_probe);
+void pci_host_handle_link_down(struct pci_host_bridge *bridge)
+{
+ dev_info(&bridge->dev, "Recovering slots due to Link Down\n");
+ pcie_do_recover_slots(bridge);
+}
+EXPORT_SYMBOL_GPL(pci_host_handle_link_down);
+
int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
{
struct resource *res = &b->busn_res;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8d7d2a49b76cf64b4218b179cec495e0d69ddf6f..76e977af2d524200b67f39a6d0417ee565cf5116 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1157,6 +1157,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
struct pci_ops *ops, void *sysdata,
struct list_head *resources);
int pci_host_probe(struct pci_host_bridge *bridge);
+void pci_host_handle_link_down(struct pci_host_bridge *bridge);
int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
void pci_bus_release_busn_res(struct pci_bus *b);
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 4/4] PCI: qcom: Add support for resetting the slot due to link down event
2025-04-16 16:29 [PATCH v2 0/4] PCI: Add support for resetting the slots in a platform specific way Manivannan Sadhasivam via B4 Relay
` (2 preceding siblings ...)
2025-04-16 16:29 ` [PATCH v2 3/4] PCI: Add link down handling for host bridges Manivannan Sadhasivam via B4 Relay
@ 2025-04-16 16:29 ` Manivannan Sadhasivam via B4 Relay
3 siblings, 0 replies; 11+ messages in thread
From: Manivannan Sadhasivam via B4 Relay @ 2025-04-16 16:29 UTC (permalink / raw)
To: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring
Cc: dingwei, cassel, Lukas Wunner, Krishna Chaitanya Chundru,
linuxppc-dev, linux-pci, linux-kernel, linux-arm-msm,
Manivannan Sadhasivam
From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
The PCIe link can go down under circumstances such as the device firmware
crash, link instability, etc... When that happens, the PCIe slot needs to
be reset to make it operational again. Currently, the driver is not
handling the link down event, due to which the users have to restart the
machine to make PCIe link operational again. So fix it by detecting the
link down event and resetting the slot.
Since the Qcom PCIe controllers report the link down event through the
'global' IRQ, enable the link down event by setting PARF_INT_ALL_LINK_DOWN
bit in PARF_INT_ALL_MASK register.
Then in the case of the event, call pci_host_handle_link_down() API
in the handler to let the PCI core handle the link down condition.
The API will internally call, 'pci_host_bridge::reset_slot()' callback to
reset the slot in a platform specific way. So implement the callback to
reset the slot by first resetting the PCIe core, followed by reinitializing
the resources and then finally starting the link again.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
---
drivers/pci/controller/dwc/pcie-qcom.c | 89 +++++++++++++++++++++++++++++++++-
1 file changed, 87 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index dc98ae63362db0422384b1879a2b9a7dc564d091..b0df108fb4f3c6a8b8290062ecb3e1c5c34ddd4c 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -55,6 +55,7 @@
#define PARF_INT_ALL_STATUS 0x224
#define PARF_INT_ALL_CLEAR 0x228
#define PARF_INT_ALL_MASK 0x22c
+#define PARF_STATUS 0x230
#define PARF_SID_OFFSET 0x234
#define PARF_BDF_TRANSLATE_CFG 0x24c
#define PARF_DBI_BASE_ADDR_V2 0x350
@@ -130,8 +131,11 @@
/* PARF_LTSSM register fields */
#define LTSSM_EN BIT(8)
+#define SW_CLEAR_FLUSH_MODE BIT(10)
+#define FLUSH_MODE BIT(11)
/* PARF_INT_ALL_{STATUS/CLEAR/MASK} register fields */
+#define PARF_INT_ALL_LINK_DOWN BIT(1)
#define PARF_INT_ALL_LINK_UP BIT(13)
#define PARF_INT_MSI_DEV_0_7 GENMASK(30, 23)
@@ -145,6 +149,9 @@
/* PARF_BDF_TO_SID_CFG fields */
#define BDF_TO_SID_BYPASS BIT(0)
+/* PARF_STATUS fields */
+#define FLUSH_COMPLETED BIT(8)
+
/* ELBI_SYS_CTRL register fields */
#define ELBI_SYS_CTRL_LT_ENABLE BIT(0)
@@ -169,6 +176,7 @@
PCIE_CAP_SLOT_POWER_LIMIT_SCALE)
#define PERST_DELAY_US 1000
+#define FLUSH_TIMEOUT_US 100
#define QCOM_PCIE_CRC8_POLYNOMIAL (BIT(2) | BIT(1) | BIT(0))
@@ -274,11 +282,14 @@ struct qcom_pcie {
struct icc_path *icc_cpu;
const struct qcom_pcie_cfg *cfg;
struct dentry *debugfs;
+ int global_irq;
bool suspended;
bool use_pm_opp;
};
#define to_qcom_pcie(x) dev_get_drvdata((x)->dev)
+static int qcom_pcie_reset_slot(struct pci_host_bridge *bridge,
+ struct pci_dev *pdev);
static void qcom_ep_reset_assert(struct qcom_pcie *pcie)
{
@@ -1263,6 +1274,8 @@ static int qcom_pcie_host_init(struct dw_pcie_rp *pp)
goto err_assert_reset;
}
+ pp->bridge->reset_slot = qcom_pcie_reset_slot;
+
return 0;
err_assert_reset:
@@ -1300,6 +1313,73 @@ static const struct dw_pcie_host_ops qcom_pcie_dw_ops = {
.post_init = qcom_pcie_host_post_init,
};
+static int qcom_pcie_reset_slot(struct pci_host_bridge *bridge,
+ struct pci_dev *pdev)
+{
+ struct pci_bus *bus = bridge->bus;
+ struct dw_pcie_rp *pp = bus->sysdata;
+ struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
+ struct qcom_pcie *pcie = to_qcom_pcie(pci);
+ struct device *dev = pcie->pci->dev;
+ u32 val;
+ int ret;
+
+ /* Wait for the pending transactions to be completed */
+ ret = readl_relaxed_poll_timeout(pcie->parf + PARF_STATUS, val,
+ val & FLUSH_COMPLETED, 10,
+ FLUSH_TIMEOUT_US);
+ if (ret) {
+ dev_err(dev, "Flush completion failed: %d\n", ret);
+ goto err_host_deinit;
+ }
+
+ /* Clear the FLUSH_MODE to allow the core to be reset */
+ val = readl(pcie->parf + PARF_LTSSM);
+ val |= SW_CLEAR_FLUSH_MODE;
+ writel(val, pcie->parf + PARF_LTSSM);
+
+ /* Wait for the FLUSH_MODE to clear */
+ ret = readl_relaxed_poll_timeout(pcie->parf + PARF_LTSSM, val,
+ !(val & FLUSH_MODE), 10,
+ FLUSH_TIMEOUT_US);
+ if (ret) {
+ dev_err(dev, "Flush mode clear failed: %d\n", ret);
+ goto err_host_deinit;
+ }
+
+ qcom_pcie_host_deinit(pp);
+
+ ret = qcom_pcie_host_init(pp);
+ if (ret) {
+ dev_err(dev, "Host init failed\n");
+ return ret;
+ }
+
+ ret = dw_pcie_setup_rc(pp);
+ if (ret)
+ goto err_host_deinit;
+
+ /*
+ * Re-enable global IRQ events as the PARF_INT_ALL_MASK register is
+ * non-sticky.
+ */
+ if (pcie->global_irq)
+ writel_relaxed(PARF_INT_ALL_LINK_UP | PARF_INT_ALL_LINK_DOWN |
+ PARF_INT_MSI_DEV_0_7, pcie->parf + PARF_INT_ALL_MASK);
+
+ qcom_pcie_start_link(pci);
+ dw_pcie_wait_for_link(pci);
+
+ dev_dbg(dev, "Slot reset completed\n");
+
+ return 0;
+
+err_host_deinit:
+ qcom_pcie_host_deinit(pp);
+
+ return ret;
+}
+
/* Qcom IP rev.: 2.1.0 Synopsys IP rev.: 4.01a */
static const struct qcom_pcie_ops ops_2_1_0 = {
.get_resources = qcom_pcie_get_resources_2_1_0,
@@ -1571,6 +1651,9 @@ static irqreturn_t qcom_pcie_global_irq_thread(int irq, void *data)
pci_unlock_rescan_remove();
qcom_pcie_icc_opp_update(pcie);
+ } else if (FIELD_GET(PARF_INT_ALL_LINK_DOWN, status)) {
+ dev_dbg(dev, "Received Link down event\n");
+ pci_host_handle_link_down(pp->bridge);
} else {
dev_WARN_ONCE(dev, 1, "Received unknown event. INT_STATUS: 0x%08x\n",
status);
@@ -1732,8 +1815,10 @@ static int qcom_pcie_probe(struct platform_device *pdev)
goto err_host_deinit;
}
- writel_relaxed(PARF_INT_ALL_LINK_UP | PARF_INT_MSI_DEV_0_7,
- pcie->parf + PARF_INT_ALL_MASK);
+ writel_relaxed(PARF_INT_ALL_LINK_UP | PARF_INT_ALL_LINK_DOWN |
+ PARF_INT_MSI_DEV_0_7, pcie->parf + PARF_INT_ALL_MASK);
+
+ pcie->global_irq = irq;
}
qcom_pcie_icc_opp_update(pcie);
--
2.43.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 3/4] PCI: Add link down handling for host bridges
2025-04-16 16:29 ` [PATCH v2 3/4] PCI: Add link down handling for host bridges Manivannan Sadhasivam via B4 Relay
@ 2025-04-16 16:57 ` Lukas Wunner
2025-04-17 7:42 ` Manivannan Sadhasivam
2025-04-16 17:51 ` Krishna Chaitanya Chundru
1 sibling, 1 reply; 11+ messages in thread
From: Lukas Wunner @ 2025-04-16 16:57 UTC (permalink / raw)
To: manivannan.sadhasivam
Cc: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczy??ski, Rob Herring, dingwei,
cassel, Krishna Chaitanya Chundru, linuxppc-dev, linux-pci,
linux-kernel, linux-arm-msm
On Wed, Apr 16, 2025 at 09:59:05PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -270,3 +270,30 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>
> return status;
> }
> +
> +static pci_ers_result_t pcie_do_slot_reset(struct pci_dev *dev)
> +{
> + int ret;
> +
> + ret = pci_bus_error_reset(dev);
> + if (ret) {
> + pci_err(dev, "Failed to reset slot: %d\n", ret);
> + return PCI_ERS_RESULT_DISCONNECT;
> + }
> +
> + pci_info(dev, "Slot has been reset\n");
> +
> + return PCI_ERS_RESULT_RECOVERED;
> +}
> +
> +void pcie_do_recover_slots(struct pci_host_bridge *host)
> +{
> + struct pci_bus *bus = host->bus;
> + struct pci_dev *dev;
> +
> + for_each_pci_bridge(dev, bus) {
> + if (pci_is_root_bus(bus))
> + pcie_do_recovery(dev, pci_channel_io_frozen,
> + pcie_do_slot_reset);
> + }
> +}
Since pcie_do_slot_reset(), pcie_do_recover_slots() and
pcie_do_recover_slots() are only needed on platforms with a
specific host controller (and not, say, on x86), it would be good
if they could be kept e.g. in a library in drivers/pci/controller/
to avoid unnecessarily enlarging the .text section for everyone else.
One option would be the existing pci-host-common.c, another a
completely new file.
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -966,6 +966,7 @@ int pci_aer_clear_status(struct pci_dev *dev);
> int pci_aer_raw_clear_status(struct pci_dev *dev);
> void pci_save_aer_state(struct pci_dev *dev);
> void pci_restore_aer_state(struct pci_dev *dev);
> +void pcie_do_recover_slots(struct pci_host_bridge *host);
> #else
> static inline void pci_no_aer(void) { }
> static inline void pci_aer_init(struct pci_dev *d) { }
> @@ -975,6 +976,26 @@ static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
> static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
> static inline void pci_save_aer_state(struct pci_dev *dev) { }
> static inline void pci_restore_aer_state(struct pci_dev *dev) { }
> +static inline void pcie_do_recover_slots(struct pci_host_bridge *host)
> +{
> + struct pci_bus *bus = host->bus;
> + struct pci_dev *dev;
> + int ret;
> +
> + if (!host->reset_slot)
> + dev_warn(&host->dev, "Missing reset_slot() callback\n");
> +
> + for_each_pci_bridge(dev, bus) {
> + if (!pci_is_root_bus(bus))
> + continue;
> +
> + ret = pci_bus_error_reset(dev);
> + if (ret)
> + pci_err(dev, "Failed to reset slot: %d\n", ret);
> + else
> + pci_info(dev, "Slot has been reset\n");
> + }
> +}
> #endif
Unusual to have such a large inline function in a header.
Can this likewise be moved to some library file and separated
from the other version via #ifdef please?
Thanks,
Lukas
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 3/4] PCI: Add link down handling for host bridges
2025-04-16 16:29 ` [PATCH v2 3/4] PCI: Add link down handling for host bridges Manivannan Sadhasivam via B4 Relay
2025-04-16 16:57 ` Lukas Wunner
@ 2025-04-16 17:51 ` Krishna Chaitanya Chundru
2025-04-17 7:54 ` Manivannan Sadhasivam
1 sibling, 1 reply; 11+ messages in thread
From: Krishna Chaitanya Chundru @ 2025-04-16 17:51 UTC (permalink / raw)
To: manivannan.sadhasivam, Mahesh J Salgaonkar, Oliver O'Halloran,
Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Rob Herring
Cc: dingwei, cassel, Lukas Wunner, linuxppc-dev, linux-pci,
linux-kernel, linux-arm-msm
On 4/16/2025 9:59 PM, Manivannan Sadhasivam via B4 Relay wrote:
> From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
>
> The PCI link, when down, needs to be recovered to bring it back. But that
> cannot be done in a generic way as link recovery procedure is specific to
> host bridges. So add a new API pci_host_handle_link_down() that could be
> called by the host bridge drivers when the link goes down.
>
> The API will iterate through all the slots and calls the pcie_do_recovery()
> function with 'pci_channel_io_frozen' as the state. This will result in the
> execution of the AER Fatal error handling code. Since the link down
> recovery is pretty much the same as AER Fatal error handling,
> pcie_do_recovery() helper is reused here. First the AER error_detected
> callback will be triggered for the bridge and the downstream devices. Then,
> pcie_do_slot_reset() will be called for each slots, which will reset the
> slots using 'reset_slot' callback to recover the link. Once that's done,
> resume message will be broadcasted to the bridge and the downstream devices
> indicating successful link recovery.
>
> In case if the AER support is not enabled in the kernel, only
> pci_bus_error_reset() will be called for each slots as there is no way we
> could inform the drivers about link recovery.
>
The PCIe endpoint drivers are registering with err_handlers and they
will be invoked only from pcie_do_recovery, but there are getting built
by default irrespective of AER is enabled or not.
Does it make sense to built err.c irrespective of AER is enabled or not
to use common logic without the need of having dependency on AER.
Also since err.c is tied with AER, DPC also had a hard requirement
to enable AER which is not needed technically.
- Krishna Chaitanya.
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> ---
> drivers/pci/pci.h | 21 +++++++++++++++++++++
> drivers/pci/pcie/err.c | 27 +++++++++++++++++++++++++++
> drivers/pci/probe.c | 7 +++++++
> include/linux/pci.h | 1 +
> 4 files changed, 56 insertions(+)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index b81e99cd4b62a3022c8b07a09f212f6888674487..6c1d4c5a82d68e5842636ff296a8d3a06487cb11 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -966,6 +966,7 @@ int pci_aer_clear_status(struct pci_dev *dev);
> int pci_aer_raw_clear_status(struct pci_dev *dev);
> void pci_save_aer_state(struct pci_dev *dev);
> void pci_restore_aer_state(struct pci_dev *dev);
> +void pcie_do_recover_slots(struct pci_host_bridge *host);
> #else
> static inline void pci_no_aer(void) { }
> static inline void pci_aer_init(struct pci_dev *d) { }
> @@ -975,6 +976,26 @@ static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
> static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
> static inline void pci_save_aer_state(struct pci_dev *dev) { }
> static inline void pci_restore_aer_state(struct pci_dev *dev) { }
> +static inline void pcie_do_recover_slots(struct pci_host_bridge *host)
> +{
> + struct pci_bus *bus = host->bus;
> + struct pci_dev *dev;
> + int ret;
> +
> + if (!host->reset_slot)
> + dev_warn(&host->dev, "Missing reset_slot() callback\n");
> +
> + for_each_pci_bridge(dev, bus) {
> + if (!pci_is_root_bus(bus))
> + continue;
> +
> + ret = pci_bus_error_reset(dev);
> + if (ret)
> + pci_err(dev, "Failed to reset slot: %d\n", ret);
> + else
> + pci_info(dev, "Slot has been reset\n");
> + }
> +}
> #endif
>
> #ifdef CONFIG_ACPI
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index b834fc0d705938540d3d7d3d8739770c09fe7cf1..70d8cd37255c5638fddf38e13ea87cb8ebe8553f 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -270,3 +270,30 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
>
> return status;
> }
> +
> +static pci_ers_result_t pcie_do_slot_reset(struct pci_dev *dev)
> +{
> + int ret;
> +
> + ret = pci_bus_error_reset(dev);
> + if (ret) {
> + pci_err(dev, "Failed to reset slot: %d\n", ret);
> + return PCI_ERS_RESULT_DISCONNECT;
> + }
> +
> + pci_info(dev, "Slot has been reset\n");
> +
> + return PCI_ERS_RESULT_RECOVERED;
> +}
> +
> +void pcie_do_recover_slots(struct pci_host_bridge *host)
> +{
> + struct pci_bus *bus = host->bus;
> + struct pci_dev *dev;
> +
> + for_each_pci_bridge(dev, bus) {
> + if (pci_is_root_bus(bus))
> + pcie_do_recovery(dev, pci_channel_io_frozen,
> + pcie_do_slot_reset);
> + }
> +}
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 364fa2a514f8a68fb18bded3259c6847d3932f8b..60ad20eea0259797e68afa7979bb1fc24b6f213b 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -3249,6 +3249,13 @@ int pci_host_probe(struct pci_host_bridge *bridge)
> }
> EXPORT_SYMBOL_GPL(pci_host_probe);
>
> +void pci_host_handle_link_down(struct pci_host_bridge *bridge)
> +{
> + dev_info(&bridge->dev, "Recovering slots due to Link Down\n");
> + pcie_do_recover_slots(bridge);
> +}
> +EXPORT_SYMBOL_GPL(pci_host_handle_link_down);
> +
> int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int bus_max)
> {
> struct resource *res = &b->busn_res;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 8d7d2a49b76cf64b4218b179cec495e0d69ddf6f..76e977af2d524200b67f39a6d0417ee565cf5116 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1157,6 +1157,7 @@ struct pci_bus *pci_create_root_bus(struct device *parent, int bus,
> struct pci_ops *ops, void *sysdata,
> struct list_head *resources);
> int pci_host_probe(struct pci_host_bridge *bridge);
> +void pci_host_handle_link_down(struct pci_host_bridge *bridge);
> int pci_bus_insert_busn_res(struct pci_bus *b, int bus, int busmax);
> int pci_bus_update_busn_res_end(struct pci_bus *b, int busmax);
> void pci_bus_release_busn_res(struct pci_bus *b);
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 3/4] PCI: Add link down handling for host bridges
2025-04-16 16:57 ` Lukas Wunner
@ 2025-04-17 7:42 ` Manivannan Sadhasivam
0 siblings, 0 replies; 11+ messages in thread
From: Manivannan Sadhasivam @ 2025-04-17 7:42 UTC (permalink / raw)
To: Lukas Wunner
Cc: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczy??ski, Rob Herring, dingwei,
cassel, Krishna Chaitanya Chundru, linuxppc-dev, linux-pci,
linux-kernel, linux-arm-msm
On Wed, Apr 16, 2025 at 06:57:11PM +0200, Lukas Wunner wrote:
> On Wed, Apr 16, 2025 at 09:59:05PM +0530, Manivannan Sadhasivam via B4 Relay wrote:
> > --- a/drivers/pci/pcie/err.c
> > +++ b/drivers/pci/pcie/err.c
> > @@ -270,3 +270,30 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> >
> > return status;
> > }
> > +
> > +static pci_ers_result_t pcie_do_slot_reset(struct pci_dev *dev)
> > +{
> > + int ret;
> > +
> > + ret = pci_bus_error_reset(dev);
> > + if (ret) {
> > + pci_err(dev, "Failed to reset slot: %d\n", ret);
> > + return PCI_ERS_RESULT_DISCONNECT;
> > + }
> > +
> > + pci_info(dev, "Slot has been reset\n");
> > +
> > + return PCI_ERS_RESULT_RECOVERED;
> > +}
> > +
> > +void pcie_do_recover_slots(struct pci_host_bridge *host)
> > +{
> > + struct pci_bus *bus = host->bus;
> > + struct pci_dev *dev;
> > +
> > + for_each_pci_bridge(dev, bus) {
> > + if (pci_is_root_bus(bus))
> > + pcie_do_recovery(dev, pci_channel_io_frozen,
> > + pcie_do_slot_reset);
> > + }
> > +}
>
> Since pcie_do_slot_reset(), pcie_do_recover_slots() and
> pcie_do_recover_slots() are only needed on platforms with a
> specific host controller (and not, say, on x86), it would be good
> if they could be kept e.g. in a library in drivers/pci/controller/
> to avoid unnecessarily enlarging the .text section for everyone else.
>
> One option would be the existing pci-host-common.c, another a
> completely new file.
>
I don't like introducing a new file, so I'll make pci-host-common as a common
library for host controller drivers and move this code there.
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -966,6 +966,7 @@ int pci_aer_clear_status(struct pci_dev *dev);
> > int pci_aer_raw_clear_status(struct pci_dev *dev);
> > void pci_save_aer_state(struct pci_dev *dev);
> > void pci_restore_aer_state(struct pci_dev *dev);
> > +void pcie_do_recover_slots(struct pci_host_bridge *host);
> > #else
> > static inline void pci_no_aer(void) { }
> > static inline void pci_aer_init(struct pci_dev *d) { }
> > @@ -975,6 +976,26 @@ static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
> > static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
> > static inline void pci_save_aer_state(struct pci_dev *dev) { }
> > static inline void pci_restore_aer_state(struct pci_dev *dev) { }
> > +static inline void pcie_do_recover_slots(struct pci_host_bridge *host)
> > +{
> > + struct pci_bus *bus = host->bus;
> > + struct pci_dev *dev;
> > + int ret;
> > +
> > + if (!host->reset_slot)
> > + dev_warn(&host->dev, "Missing reset_slot() callback\n");
> > +
> > + for_each_pci_bridge(dev, bus) {
> > + if (!pci_is_root_bus(bus))
> > + continue;
> > +
> > + ret = pci_bus_error_reset(dev);
> > + if (ret)
> > + pci_err(dev, "Failed to reset slot: %d\n", ret);
> > + else
> > + pci_info(dev, "Slot has been reset\n");
> > + }
> > +}
> > #endif
>
> Unusual to have such a large inline function in a header.
> Can this likewise be moved to some library file and separated
> from the other version via #ifdef please?
>
Sure.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 3/4] PCI: Add link down handling for host bridges
2025-04-16 17:51 ` Krishna Chaitanya Chundru
@ 2025-04-17 7:54 ` Manivannan Sadhasivam
2025-04-17 9:11 ` Krishna Chaitanya Chundru
0 siblings, 1 reply; 11+ messages in thread
From: Manivannan Sadhasivam @ 2025-04-17 7:54 UTC (permalink / raw)
To: Krishna Chaitanya Chundru
Cc: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
dingwei, cassel, Lukas Wunner, linuxppc-dev, linux-pci,
linux-kernel, linux-arm-msm
On Wed, Apr 16, 2025 at 11:21:49PM +0530, Krishna Chaitanya Chundru wrote:
>
>
> On 4/16/2025 9:59 PM, Manivannan Sadhasivam via B4 Relay wrote:
> > From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> >
> > The PCI link, when down, needs to be recovered to bring it back. But that
> > cannot be done in a generic way as link recovery procedure is specific to
> > host bridges. So add a new API pci_host_handle_link_down() that could be
> > called by the host bridge drivers when the link goes down.
> >
> > The API will iterate through all the slots and calls the pcie_do_recovery()
> > function with 'pci_channel_io_frozen' as the state. This will result in the
> > execution of the AER Fatal error handling code. Since the link down
> > recovery is pretty much the same as AER Fatal error handling,
> > pcie_do_recovery() helper is reused here. First the AER error_detected
> > callback will be triggered for the bridge and the downstream devices. Then,
> > pcie_do_slot_reset() will be called for each slots, which will reset the
> > slots using 'reset_slot' callback to recover the link. Once that's done,
> > resume message will be broadcasted to the bridge and the downstream devices
> > indicating successful link recovery.
> >
> > In case if the AER support is not enabled in the kernel, only
> > pci_bus_error_reset() will be called for each slots as there is no way we
> > could inform the drivers about link recovery.
> >
> The PCIe endpoint drivers are registering with err_handlers and they
> will be invoked only from pcie_do_recovery, but there are getting built
> by default irrespective of AER is enabled or not.
>
AER is *one* of the functionalities of an endpoint. And the endpoint could
mostly work without AER reporting (except for AER fatal/non-fatal where recovery
need to be performed by the host). So it wouldn't make sense to add AER
dependency for them.
> Does it make sense to built err.c irrespective of AER is enabled or not
> to use common logic without the need of having dependency on AER.
>
Well, yes and no. Right now, only DPC reuses the err handlers except AER. But
DPC driver itself is functional dependent on AER. So I don't think it is really
required to build err.c independent of AER. But I will try to rework the code in
the future for fixing things like 'AER' prefix added to logs and such.
> Also since err.c is tied with AER, DPC also had a hard requirement
> to enable AER which is not needed technically.
>
DPC driver is functional dependent on AER.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 3/4] PCI: Add link down handling for host bridges
2025-04-17 7:54 ` Manivannan Sadhasivam
@ 2025-04-17 9:11 ` Krishna Chaitanya Chundru
2025-04-17 9:33 ` Manivannan Sadhasivam
0 siblings, 1 reply; 11+ messages in thread
From: Krishna Chaitanya Chundru @ 2025-04-17 9:11 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
dingwei, cassel, Lukas Wunner, linuxppc-dev, linux-pci,
linux-kernel, linux-arm-msm
On 4/17/2025 1:24 PM, Manivannan Sadhasivam wrote:
> On Wed, Apr 16, 2025 at 11:21:49PM +0530, Krishna Chaitanya Chundru wrote:
>>
>>
>> On 4/16/2025 9:59 PM, Manivannan Sadhasivam via B4 Relay wrote:
>>> From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
>>>
>>> The PCI link, when down, needs to be recovered to bring it back. But that
>>> cannot be done in a generic way as link recovery procedure is specific to
>>> host bridges. So add a new API pci_host_handle_link_down() that could be
>>> called by the host bridge drivers when the link goes down.
>>>
>>> The API will iterate through all the slots and calls the pcie_do_recovery()
>>> function with 'pci_channel_io_frozen' as the state. This will result in the
>>> execution of the AER Fatal error handling code. Since the link down
>>> recovery is pretty much the same as AER Fatal error handling,
>>> pcie_do_recovery() helper is reused here. First the AER error_detected
>>> callback will be triggered for the bridge and the downstream devices. Then,
>>> pcie_do_slot_reset() will be called for each slots, which will reset the
>>> slots using 'reset_slot' callback to recover the link. Once that's done,
>>> resume message will be broadcasted to the bridge and the downstream devices
>>> indicating successful link recovery.
>>>
>>> In case if the AER support is not enabled in the kernel, only
>>> pci_bus_error_reset() will be called for each slots as there is no way we
>>> could inform the drivers about link recovery.
>>>
>> The PCIe endpoint drivers are registering with err_handlers and they
>> will be invoked only from pcie_do_recovery, but there are getting built
>> by default irrespective of AER is enabled or not.
>>
>
> AER is *one* of the functionalities of an endpoint. And the endpoint could
> mostly work without AER reporting (except for AER fatal/non-fatal where recovery
> need to be performed by the host). So it wouldn't make sense to add AER
> dependency for them.
>
>> Does it make sense to built err.c irrespective of AER is enabled or not
>> to use common logic without the need of having dependency on AER.
>>
>
> Well, yes and no. Right now, only DPC reuses the err handlers except AER. But
> DPC driver itself is functional dependent on AER. So I don't think it is really
> required to build err.c independent of AER. But I will try to rework the code in
> the future for fixing things like 'AER' prefix added to logs and such.
>
Right now we have DPC & AER to use this pcie_do_recovery(), now we are
adding supporting for controller reported error (Link down) not sure if
there will be newer ways to report errors in future.
May be not in this series, in future better to de-couple err.c from
AER as err.c. As the sources of error reporting is not limited to AER
or DPC alone now.
>> Also since err.c is tied with AER, DPC also had a hard requirement
>> to enable AER which is not needed technically.
>>
>
> DPC driver is functional dependent on AER.
I got a impression by seeing below statement that DPC can work
independently.
As per spec 6 sec 6.2.11.2, DPC error signaling "A DPC-capable
Downstream Port must support ERR_COR signaling, independent of whether
it supports Advanced Error Reporting (AER) or not".
In fact it can work if AER is not enabled also, but will not have full
functionality of DPC.
- Krishna Chaitanya.
>
> - Mani
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 3/4] PCI: Add link down handling for host bridges
2025-04-17 9:11 ` Krishna Chaitanya Chundru
@ 2025-04-17 9:33 ` Manivannan Sadhasivam
0 siblings, 0 replies; 11+ messages in thread
From: Manivannan Sadhasivam @ 2025-04-17 9:33 UTC (permalink / raw)
To: Krishna Chaitanya Chundru
Cc: Mahesh J Salgaonkar, Oliver O'Halloran, Bjorn Helgaas,
Lorenzo Pieralisi, Krzysztof Wilczyński, Rob Herring,
dingwei, cassel, Lukas Wunner, linuxppc-dev, linux-pci,
linux-kernel, linux-arm-msm
On Thu, Apr 17, 2025 at 02:41:55PM +0530, Krishna Chaitanya Chundru wrote:
>
>
> On 4/17/2025 1:24 PM, Manivannan Sadhasivam wrote:
> > On Wed, Apr 16, 2025 at 11:21:49PM +0530, Krishna Chaitanya Chundru wrote:
> > >
> > >
> > > On 4/16/2025 9:59 PM, Manivannan Sadhasivam via B4 Relay wrote:
> > > > From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> > > >
> > > > The PCI link, when down, needs to be recovered to bring it back. But that
> > > > cannot be done in a generic way as link recovery procedure is specific to
> > > > host bridges. So add a new API pci_host_handle_link_down() that could be
> > > > called by the host bridge drivers when the link goes down.
> > > >
> > > > The API will iterate through all the slots and calls the pcie_do_recovery()
> > > > function with 'pci_channel_io_frozen' as the state. This will result in the
> > > > execution of the AER Fatal error handling code. Since the link down
> > > > recovery is pretty much the same as AER Fatal error handling,
> > > > pcie_do_recovery() helper is reused here. First the AER error_detected
> > > > callback will be triggered for the bridge and the downstream devices. Then,
> > > > pcie_do_slot_reset() will be called for each slots, which will reset the
> > > > slots using 'reset_slot' callback to recover the link. Once that's done,
> > > > resume message will be broadcasted to the bridge and the downstream devices
> > > > indicating successful link recovery.
> > > >
> > > > In case if the AER support is not enabled in the kernel, only
> > > > pci_bus_error_reset() will be called for each slots as there is no way we
> > > > could inform the drivers about link recovery.
> > > >
> > > The PCIe endpoint drivers are registering with err_handlers and they
> > > will be invoked only from pcie_do_recovery, but there are getting built
> > > by default irrespective of AER is enabled or not.
> > >
> >
> > AER is *one* of the functionalities of an endpoint. And the endpoint could
> > mostly work without AER reporting (except for AER fatal/non-fatal where recovery
> > need to be performed by the host). So it wouldn't make sense to add AER
> > dependency for them.
> >
> > > Does it make sense to built err.c irrespective of AER is enabled or not
> > > to use common logic without the need of having dependency on AER.
> > >
> >
> > Well, yes and no. Right now, only DPC reuses the err handlers except AER. But
> > DPC driver itself is functional dependent on AER. So I don't think it is really
> > required to build err.c independent of AER. But I will try to rework the code in
> > the future for fixing things like 'AER' prefix added to logs and such.
> >
> Right now we have DPC & AER to use this pcie_do_recovery(), now we are
> adding supporting for controller reported error (Link down) not sure if
> there will be newer ways to report errors in future.
>
> May be not in this series, in future better to de-couple err.c from
> AER as err.c. As the sources of error reporting is not limited to AER
> or DPC alone now.
>
Yes, that's part of my plan.
> > > Also since err.c is tied with AER, DPC also had a hard requirement
> > > to enable AER which is not needed technically.
> > >
> >
> > DPC driver is functional dependent on AER.
> I got a impression by seeing below statement that DPC can work
> independently.
> As per spec 6 sec 6.2.11.2, DPC error signaling "A DPC-capable
> Downstream Port must support ERR_COR signaling, independent of whether
> it supports Advanced Error Reporting (AER) or not".
>
That's why I intentionally said 'DPC driver' not 'DPC'. The driver has the
dependency, not the feature.
> In fact it can work if AER is not enabled also, but will not have full
> functionality of DPC.
>
Right. That's why I said functionally dependent.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-04-17 12:12 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-16 16:29 [PATCH v2 0/4] PCI: Add support for resetting the slots in a platform specific way Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 1/4] PCI/ERR: Remove misleading TODO regarding kernel panic Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 2/4] PCI/ERR: Add support for resetting the slots in a platform specific way Manivannan Sadhasivam via B4 Relay
2025-04-16 16:29 ` [PATCH v2 3/4] PCI: Add link down handling for host bridges Manivannan Sadhasivam via B4 Relay
2025-04-16 16:57 ` Lukas Wunner
2025-04-17 7:42 ` Manivannan Sadhasivam
2025-04-16 17:51 ` Krishna Chaitanya Chundru
2025-04-17 7:54 ` Manivannan Sadhasivam
2025-04-17 9:11 ` Krishna Chaitanya Chundru
2025-04-17 9:33 ` Manivannan Sadhasivam
2025-04-16 16:29 ` [PATCH v2 4/4] PCI: qcom: Add support for resetting the slot due to link down event Manivannan Sadhasivam via B4 Relay
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).