All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4] PCI: qcom-ep: Fix runtime PM reference leak in probe error paths
@ 2026-06-24 13:49 Wentao Liang
  2026-06-24 14:04 ` sashiko-bot
  0 siblings, 1 reply; 2+ messages in thread
From: Wentao Liang @ 2026-06-24 13:49 UTC (permalink / raw)
  To: Manivannan Sadhasivam, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas
  Cc: linux-arm-msm, linux-pci, linux-kernel, stable, Wentao Liang

The qcom_pcie_ep_probe() function obtains a runtime PM reference via
pm_runtime_get_noresume() at the beginning of the probe, but this
reference is only released on the successful completion path using
pm_runtime_put_sync().

However, the error paths fail to release this reference. The devres
cleanup registered by devm_pm_runtime_enable() only calls
pm_runtime_disable() and does not decrement the usage_count. As a
result, if any error occurs during probe (e.g., resource acquisition
failure or endpoint initialization failure), the reference is leaked,
permanently elevating the device's usage_count. This prevents proper
runtime suspend and clean device removal.

Fix this by properly distinguishing error paths that occur before and
after devm_pm_runtime_enable() succeeds:

- For errors before devm_pm_runtime_enable() succeeds: call
  pm_runtime_put_noidle() and pm_runtime_disable() manually to clean
  up the initial reference and disable runtime PM.

- For errors after devm_pm_runtime_enable() succeeds: only call
  pm_runtime_put_sync() to drop the initial reference. The
  pm_runtime_disable() is left to the devres cleanup handler,
  avoiding a double-disable.

- On the successful probe path: call pm_runtime_put_sync() to drop
  the initial reference. The return value is not checked because
  -EAGAIN/-EBUSY only indicate that the device cannot be suspended
  at this moment and do not represent a probe failure. Checking
  the return value would cause a double-put if it fails.

This ensures the runtime PM reference is correctly released on all
error paths without introducing double-disables or double-puts.

Cc: stable@vger.kernel.org
Fixes: 5b026a9e714d ("PCI: qcom-ep: Add support for firmware-managed PCIe Endpoint")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
---
 drivers/pci/controller/dwc/pcie-qcom-ep.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
index 257c2bcb5f76..4fb21ef50cd0 100644
--- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
+++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
@@ -892,16 +892,16 @@ static int qcom_pcie_ep_probe(struct platform_device *pdev)
 	pm_runtime_set_active(dev);
 	ret = devm_pm_runtime_enable(dev);
 	if (ret)
-		return ret;
+		goto err_manual_cleanup;
 
 	ret = qcom_pcie_ep_get_resources(pdev, pcie_ep);
 	if (ret)
-		return ret;
+		goto err_put_ref;
 
 	ret = dw_pcie_ep_init(&pcie_ep->pci.ep);
 	if (ret) {
 		dev_err(dev, "Failed to initialize endpoint: %d\n", ret);
-		return ret;
+		goto err_put_ref;
 	}
 
 	ret = qcom_pcie_ep_enable_irq_resources(pdev, pcie_ep);
@@ -914,11 +914,7 @@ static int qcom_pcie_ep_probe(struct platform_device *pdev)
 		goto err_disable_irqs;
 	}
 
-	ret = pm_runtime_put_sync(dev);
-	if (ret < 0) {
-		dev_err(dev, "Failed to suspend device: %d\n", ret);
-		goto err_disable_irqs;
-	}
+	pm_runtime_put_sync(dev);
 
 	pcie_ep->debugfs = debugfs_create_dir(name, NULL);
 	qcom_pcie_ep_init_debugfs(pcie_ep);
@@ -932,6 +928,13 @@ static int qcom_pcie_ep_probe(struct platform_device *pdev)
 err_ep_deinit:
 	dw_pcie_ep_deinit(&pcie_ep->pci.ep);
 
+err_put_ref:
+	pm_runtime_put_sync(dev);
+	return ret;
+
+err_manual_cleanup:
+	pm_runtime_put_noidle(dev);
+	pm_runtime_disable(dev);
 	return ret;
 }
 
-- 
2.39.5 (Apple Git-154)


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH v4] PCI: qcom-ep: Fix runtime PM reference leak in probe error paths
  2026-06-24 13:49 [PATCH v4] PCI: qcom-ep: Fix runtime PM reference leak in probe error paths Wentao Liang
@ 2026-06-24 14:04 ` sashiko-bot
  0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-06-24 14:04 UTC (permalink / raw)
  To: Wentao Liang; +Cc: linux-pci

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

New issues:
- [High] The patch introduces a manual `pm_runtime_disable()` call on the `devm_pm_runtime_enable()` error path, which causes a double-disable and permanently breaks runtime PM for the device.

Pre-existing issues:
- [High] `qcom_pcie_ep_remove()` does not call `dw_pcie_ep_deinit()`, causing permanent memory leaks of EPC memory structures and MSI address allocations on driver unbind.
- [High] `qcom_pcie_disable_resources()` uses the asynchronous `pm_runtime_put()` instead of synchronous `pm_runtime_put_sync()`, causing the pending suspend to be cancelled during device removal.
- [High] The global IRQ handler accesses hardware registers without acquiring a runtime PM reference, risking unclocked access and kernel panics if the IRQ fires while the device is suspended.
--

commit bfc2657160a90bca55dc8326794de5109d7591fc
Author: Wentao Liang <vulab@iscas.ac.cn>
Subject: PCI: qcom-ep: Fix runtime PM reference leak in probe error paths

The commit modifies qcom_pcie_ep_probe() to properly release the runtime
PM reference obtained via pm_runtime_get_noresume() on error paths. It
introduces differentiated error handling to avoid double-puts or
double-disables depending on whether devm_pm_runtime_enable() succeeded.

> diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
> index 257c2bcb5f76f..aa3b7cdd513cd 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
> @@ -892,16 +892,16 @@ static int qcom_pcie_ep_probe(struct platform_device *pdev)
>  	pm_runtime_set_active(dev);
>  	ret = devm_pm_runtime_enable(dev);
>  	if (ret)
> -		return ret;
> +		goto err_manual_cleanup;

[Severity: High]
If devm_pm_runtime_enable() fails, does this cause a double-disable of
runtime PM? 

When devm_pm_runtime_enable() encounters an error (like devres memory
allocation failure), it automatically calls pm_runtime_disable() to roll
back. By jumping to err_manual_cleanup, we call pm_runtime_disable() a
second time. This would increment the device's power.disable_depth to 2,
permanently preventing runtime PM enablement on subsequent probe attempts.

[ ... ]

> +err_manual_cleanup:
> +	pm_runtime_put_noidle(dev);
> +	pm_runtime_disable(dev);
>  	return ret;
>  }

[Severity: High]
This is a pre-existing issue, but does qcom_pcie_ep_remove() fail to clean up
the EPC memory?

Looking at qcom_pcie_ep_remove() in drivers/pci/controller/dwc/pcie-qcom-ep.c,
it appears to omit the dw_pcie_ep_deinit() call. During probe,
dw_pcie_ep_init() allocates the EPC address space using kzalloc() and MSI
memory via ioremap(), which are not devres-managed. Could omitting the deinit
function permanently leak these memory regions and eventually exhaust system
memory on driver unbind?

[Severity: High]
This is a pre-existing issue, but should qcom_pcie_disable_resources() use
pm_runtime_put_sync() instead of pm_runtime_put()?

In drivers/pci/controller/dwc/pcie-qcom-ep.c:qcom_pcie_disable_resources(),
pm_runtime_put(dev) queues an asynchronous suspend. Immediately after removal,
the devres cleanup invokes pm_runtime_disable(), which calls
__pm_runtime_barrier() to synchronously cancel any pending async requests.
Does this leave the device permanently powered on and active?

[Severity: High]
This is a pre-existing issue, but is it possible for the global IRQ handler
to access unpowered hardware registers?

If the global_irq fires (e.g., a spurious interrupt) after the device is
runtime suspended, qcom_pcie_ep_global_irq_thread() accesses the PARF
registers without calling pm_runtime_get_if_active(). 

drivers/pci/controller/dwc/pcie-qcom-ep.c:qcom_pcie_ep_global_irq_thread() {
    ...
    u32 status = readl_relaxed(pcie_ep->parf + PARF_INT_ALL_STATUS);
    ...
}

Since pm_runtime_put_sync() at the end of probe drops the PM reference, GenPD
may power down the domain. Would accessing unpowered registers here cause a
synchronous external abort and crash the kernel?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260624134932.44357-1-vulab@iscas.ac.cn?part=1

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-06-24 14:04 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-24 13:49 [PATCH v4] PCI: qcom-ep: Fix runtime PM reference leak in probe error paths Wentao Liang
2026-06-24 14:04 ` sashiko-bot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.