Linux PCI subsystem development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Manivannan Sadhasivam via B4 Relay"
	<devnull+manivannan.sadhasivam.oss.qualcomm.com@kernel.org>
Cc: linux-pci@vger.kernel.org
Subject: Re: [PATCH v8 4/5] PCI: qcom: Add support for resetting the Root Port due to link down event
Date: Mon, 18 May 2026 17:38:36 +0000	[thread overview]
Message-ID: <20260518173836.C77C3C2BCB7@smtp.kernel.org> (raw)
In-Reply-To: <20260518-pci-port-reset-v8-4-eb5a7d331dfc@oss.qualcomm.com>

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
- [Critical] System crash on error path due to unclocked controller.
- [High] Unintended MSI interception by the global IRQ handler causes severe performance regressions or silently drops events.
- [High] Data race and use-after-free via lockless iteration of PCI bus devices in IRQ thread.
- [Medium] Ignored return value masks link retraining failures.
--

commit f4a6523af4f70d1a2c5951a7190c4516cab3c8fc
Author: Manivannan Sadhasivam <mani@kernel.org>

PCI: qcom: Add support for resetting the Root Port due to link down event

This commit detects the PCIe link down event and resets the Root Port
by first resetting the PCIe core, reinitializing resources, and restarting
the link to make it operational again.

> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index af6bf5cce65be..feda8abf5f851 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c

[ ... ]

> @@ -1613,6 +1633,78 @@ static void qcom_pcie_icc_opp_update(struct qcom_pcie *pcie)
> +static int qcom_pcie_reset_root_port(struct pci_host_bridge *bridge,
> +				  struct pci_dev *pdev)
> +{

[ ... ]

> +	ret = dw_pcie_setup_rc(pp);
> +	if (ret)
> +		goto err_host_deinit;
> +
> +	/*
> +	 * Re-enable global IRQ events as the PARF_INT_ALL_MASK register is
> +	 * non-sticky.
> +	 */
> +	if (pcie->global_irq)
> +		writel_relaxed(PARF_INT_ALL_LINK_DOWN | PARF_INT_MSI_DEV_0_7,
> +				pcie->parf + PARF_INT_ALL_MASK);
> +
> +	qcom_pcie_start_link(pci);
> +	dw_pcie_wait_for_link(pci);
> +
> +	dev_dbg(dev, "Root Port reset completed\n");
> +
> +	return 0;

Does dw_pcie_wait_for_link() return an error code (such as -ETIMEDOUT) if the
link fails to retrain?

Since this return value is ignored and the function unconditionally returns 0,
the PCI core is falsely informed that the recovery was successful.

Could this prevent the PCI subsystem from properly disconnecting the downstream
devices if the link fails to come up, leaving zombie device structures that
cannot communicate?

> +
> +err_host_deinit:
> +	qcom_pcie_host_deinit(pp);
> +
> +	return ret;
> +}

If an error occurs earlier in this function and the code jumps to
err_host_deinit, qcom_pcie_host_deinit() will be called, which turns off the
PHY power, regulators, and PCIe clocks.

When this function returns the error code back to the PCI core, the PCI core
handles the reset failure by returning PCI_ERS_RESULT_DISCONNECT and
triggering device removal.

Since the PCI core and userspace tools might attempt to read the devices'
configuration space during unbind, and the clocks are now disabled, will these
MMIO accesses cause a synchronous external abort and a kernel panic?

Should the controller remain clocked on the error path so it can safely reject
config accesses with 0xFFFFFFFF?

> @@ -1650,6 +1742,27 @@ static void qcom_pcie_init_debugfs(struct qcom_pcie *pcie)
> +static irqreturn_t qcom_pcie_global_irq_thread(int irq, void *data)
> +{
> +	struct qcom_pcie *pcie = data;
> +	struct dw_pcie_rp *pp = &pcie->pci->pp;
> +	struct device *dev = pcie->pci->dev;
> +	struct pci_dev *port;
> +	unsigned long status = readl_relaxed(pcie->parf + PARF_INT_ALL_STATUS);
> +
> +	writel_relaxed(status, pcie->parf + PARF_INT_ALL_CLEAR);
> +

In qcom_pcie_reset_root_port() and qcom_pcie_probe(), PARF_INT_MSI_DEV_0_7
is unmasked in PARF_INT_ALL_MASK. By explicitly unmasking this bit alongside
PARF_INT_ALL_LINK_DOWN, does this cause MSI interrupts to be incorrectly
routed to this global IRQ handler?

The threaded handler reads the status and writes it to PARF_INT_ALL_CLEAR,
but ignores the MSI bits and unconditionally returns IRQ_HANDLED.

Could this lead to silent dropping of MSI interrupts if the Synopsys DesignWare
core relies on this status being preserved?

Also, for high-frequency interrupt sources, would every MSI unnecessarily wake
up qcom_pcie_global_irq_thread, causing excessive CPU overhead?

> +	if (test_and_clear_bit(INT_ALL_LINK_DOWN, &status)) {
> +		dev_dbg(dev, "Received Link down event\n");
> +		for_each_pci_bridge(port, pp->bridge->bus) {
> +			if (pci_pcie_type(port) == PCI_EXP_TYPE_ROOT_PORT)
> +				pci_host_handle_link_down(port);
> +		}
> +	}

Does iterating over the root bus devices using for_each_pci_bridge() in an
asynchronous threaded IRQ context require holding the pci_bus_sem read-write
semaphore?

Without locking, could this iteration race with concurrent PCI topology
changes like hotplug events, AER recoveries, or driver unbinds, potentially
leading to list corruption or a use-after-free if a device structure is
freed while the loop is executing?

Would it be safer to use pci_walk_bus() or explicitly acquire
down_read(&pci_bus_sem)?

> +
> +	return IRQ_HANDLED;
> +}

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260518-pci-port-reset-v8-0-eb5a7d331dfc@oss.qualcomm.com?part=4

  reply	other threads:[~2026-05-18 17:38 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-18 14:59 [PATCH v8 0/5] PCI: Add support for resetting the Root Ports in a platform specific way Manivannan Sadhasivam via B4 Relay
2026-05-18 14:59 ` [PATCH v8 1/5] PCI: dwc: ep: Clear MSI iATU mapping in dw_pcie_ep_cleanup() Manivannan Sadhasivam via B4 Relay
2026-05-18 16:04   ` sashiko-bot
2026-05-18 14:59 ` [PATCH v8 2/5] PCI/ERR: Add support for resetting the Root Ports in a platform specific way Manivannan Sadhasivam via B4 Relay
2026-05-18 16:27   ` sashiko-bot
2026-05-18 14:59 ` [PATCH v8 3/5] PCI: host-common: Add link down handling for Root Ports Manivannan Sadhasivam via B4 Relay
2026-05-18 17:05   ` sashiko-bot
2026-05-18 14:59 ` [PATCH v8 4/5] PCI: qcom: Add support for resetting the Root Port due to link down event Manivannan Sadhasivam via B4 Relay
2026-05-18 17:38   ` sashiko-bot [this message]
2026-05-18 14:59 ` [PATCH v8 5/5] misc: pci_endpoint_test: Add AER error handlers Manivannan Sadhasivam via B4 Relay
2026-05-18 17:55   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260518173836.C77C3C2BCB7@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=devnull+manivannan.sadhasivam.oss.qualcomm.com@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox