* [PATCH] PCI: qcom-ep: fix refcount leak in qcom_pcie_ep_probe()
@ 2026-06-09 10:07 Wentao Liang
From: Wentao Liang @ 2026-06-09 10:07 UTC
To: mani, lpieralisi, kwilczynski, robh, bhelgaas
Cc: linux-pci, linux-arm-msm, linux-kernel, Wentao Liang, stable
qcom_pcie_ep_probe() obtains a runtime PM reference via
pm_runtime_get_noresume() but only releases it on the successful
completion of the function using pm_runtime_put_sync(). The devm
cleanup registered by devm_pm_runtime_enable() does not put the
reference - it only disables autosuspend and calls
pm_runtime_disable(), which does not decrement usage_count. Thus
if any of the probe error paths are taken (e.g. resource acquisition
or endpoint init failure) the reference is leaked, permanently
elevating the device's usage_count and preventing proper runtime
suspend or clean removal.

Add a new error label that calls pm_runtime_put_noidle() and
pm_runtime_disable() before returning the error, and convert the
early return statements to use that label. This ensures the
reference is correctly released on all error paths.
Cc: stable@vger.kernel.org
Fixes: 5b026a9e714d ("PCI: qcom-ep: Add support for firmware-managed PCIe Endpoint")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
---
drivers/pci/controller/dwc/pcie-qcom-ep.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
index 257c2bcb5f76..c01d4d3b4952 100644
--- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
+++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
@@ -892,16 +892,16 @@ static int qcom_pcie_ep_probe(struct platform_device *pdev)
 	pm_runtime_set_active(dev);
 	ret = devm_pm_runtime_enable(dev);
 	if (ret)
-		return ret;
+		goto err_rpm_put;
 
 	ret = qcom_pcie_ep_get_resources(pdev, pcie_ep);
 	if (ret)
-		return ret;
+		goto err_rpm_put;
 
 	ret = dw_pcie_ep_init(&pcie_ep->pci.ep);
 	if (ret) {
 		dev_err(dev, "Failed to initialize endpoint: %d\n", ret);
-		return ret;
+		goto err_rpm_put;
 	}
 
 	ret = qcom_pcie_ep_enable_irq_resources(pdev, pcie_ep);
@@ -932,6 +932,10 @@ static int qcom_pcie_ep_probe(struct platform_device *pdev)
 err_ep_deinit:
 	dw_pcie_ep_deinit(&pcie_ep->pci.ep);
 
+err_rpm_put:
+	pm_runtime_put_noidle(dev);
+	pm_runtime_disable(dev);
+
 	return ret;
 }
--
2.34.1
* Re: [PATCH] PCI: qcom-ep: fix refcount leak in qcom_pcie_ep_probe()
From: sashiko-bot @ 2026-06-09 10:15 UTC
To: Wentao Liang; +Cc: linux-pci
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
New issues:
- [High] Calling `pm_runtime_disable()` manually in the error path after using `devm_pm_runtime_enable()` causes a double-disable, permanently breaking runtime PM.
- [High] The error path for a failed `pm_runtime_put_sync()` falls through to `err_rpm_put`, leading to a double decrement of the runtime PM usage_count.
Pre-existing issues:
- [High] Failing the entire probe function if `pm_runtime_put_sync()` returns an error is incorrect and brittle.
--
commit 2adeb10c2dada5b5e65aed93fd2d269f6cd5a790
Author: Wentao Liang <vulab@iscas.ac.cn>
PCI: qcom-ep: fix refcount leak in qcom_pcie_ep_probe()
This commit attempts to fix a runtime PM reference leak in the error paths
of qcom_pcie_ep_probe() by introducing a new error label. The new label
calls pm_runtime_put_noidle() and pm_runtime_disable() to properly release
the reference before returning the error.
> diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
> index 257c2bcb5f76f..c01d4d3b49520 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
[ ... ]
> @@ -932,6 +932,10 @@ static int qcom_pcie_ep_probe(struct platform_device *pdev)
>  err_ep_deinit:
>  	dw_pcie_ep_deinit(&pcie_ep->pci.ep);
>
> +err_rpm_put:
> +	pm_runtime_put_noidle(dev);
> +	pm_runtime_disable(dev);

[Severity: High]
Does calling pm_runtime_disable() here in qcom_pcie_ep_probe() cause a
double-disable of runtime PM?

Earlier in the probe function, the code uses devm_pm_runtime_enable(dev).
This registers a devres cleanup action that will automatically call
pm_runtime_disable() upon probe failure or device removal.

If we explicitly call pm_runtime_disable() here in the error path, it
appears it will be executed twice (once explicitly, and once via the devres
unwind). Because __pm_runtime_disable() unconditionally increments the
device's disable_depth counter, could this leave the counter permanently
elevated to 2 and break runtime PM for any subsequent probe attempts?
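
The counter arithmetic behind this question can be sketched in user space.
This is only a rough model, not kernel code: struct rpm_model and the
model_* helpers are hypothetical names, and the model assumes that a device
starts with a disable depth of 1 and that every disable call increments it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the disable_depth counter; not kernel code. */
struct rpm_model {
	int disable_depth;	/* runtime PM counts as enabled only at depth 0 */
};

static void model_init(struct rpm_model *m)
{
	m->disable_depth = 1;	/* assumption: devices start out disabled */
}

static void model_enable(struct rpm_model *m)
{
	if (m->disable_depth > 0)
		m->disable_depth--;
}

static void model_disable(struct rpm_model *m)
{
	m->disable_depth++;	/* assumption: every disable call bumps the depth */
}

static bool model_enabled(const struct rpm_model *m)
{
	return m->disable_depth == 0;
}

/*
 * Replays a failed probe that takes the new error label, followed by a
 * second probe attempt. Returns whether runtime PM is enabled after the
 * second attempt's enable call.
 */
static bool model_reprobe_after_patched_failure(void)
{
	struct rpm_model m;

	model_init(&m);
	model_enable(&m);	/* first probe enables: depth 1 -> 0 */
	model_disable(&m);	/* explicit disable at the error label: 0 -> 1 */
	model_disable(&m);	/* devres unwind disables again: 1 -> 2 */
	model_enable(&m);	/* second probe enables: 2 -> 1 */

	return model_enabled(&m);
}
```

Under these assumptions model_reprobe_after_patched_failure() returns
false: the second probe's enable call only brings the depth back to 1.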

[Severity: High]
Will this new error label result in a double decrement of the usage_count
if the probe fails during pm_runtime_put_sync()?

Later in qcom_pcie_ep_probe(), if pm_runtime_put_sync() fails, the
function jumps to err_disable_irqs:

	ret = pm_runtime_put_sync(dev);
	if (ret < 0) {
		dev_err(dev, "Failed to suspend device: %d\n", ret);
		goto err_disable_irqs;
	}

The err_disable_irqs path then falls through to this new err_rpm_put label.
Because pm_runtime_put_sync() maps to __pm_runtime_idle() with the
RPM_GET_PUT flag, the usage_count is decremented first before the idle
check even begins.

If it returns an error like -EAGAIN, the usage_count has already been
successfully dropped. Falling through to call pm_runtime_put_noidle()
again will attempt to drop the usage_count a second time. Could this
wrongfully steal a concurrent thread's reference and lead to an eventual
counter underflow?
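
The sequence described above can be traced with a small user-space model.
It is a sketch under stated assumptions rather than the kernel
implementation: struct usage_model and the model_* helpers are hypothetical,
the put-sync helper always drops the count before reporting the idle result,
and other_refs stands in for references held by other threads.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical user-space model of the usage counter; not kernel code. */
struct usage_model {
	int usage_count;
};

static void model_get_noresume(struct usage_model *m)
{
	m->usage_count++;
}

/*
 * Models the ordering described in the review: the count is dropped
 * first, and only then is the idle result reported to the caller.
 */
static int model_put_sync(struct usage_model *m, int idle_result)
{
	m->usage_count--;
	return idle_result;
}

static void model_put_noidle(struct usage_model *m)
{
	m->usage_count--;
}

/*
 * Replays the patched probe when the final put reports -EAGAIN.
 * other_refs is the number of references held elsewhere (an assumption
 * of this model). Returns the usage count left after the error path.
 */
static int model_failed_put_sync_path(int other_refs)
{
	struct usage_model m = { .usage_count = other_refs };

	model_get_noresume(&m);			/* probe takes its reference */
	if (model_put_sync(&m, -EAGAIN) < 0)	/* probe's reference is dropped here */
		model_put_noidle(&m);		/* fallthrough drops a second one */

	return m.usage_count;
}
```

With one reference held elsewhere, model_failed_put_sync_path(1) returns 0
in this model, so the other holder's reference has been consumed by the
error path.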

[Severity: High]
This isn't a bug introduced by this patch, but is it correct to treat a
negative return from pm_runtime_put_sync() as a fatal probe failure?

Looking at the same code block referenced above in qcom_pcie_ep_probe(),
a return value like -EAGAIN simply means the device cannot be synchronously
suspended at this exact moment (for example, if another thread concurrently
incremented the usage counter).

Since this is a valid runtime PM state and not a hardware or initialization
failure, does tearing down the fully working device and failing the probe
cause spurious device initialization failures?

> +
>  	return ret;
>  }
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609100747.230521-1-vulab@iscas.ac.cn?part=1