Re: [PATCH] PCI: qcom-ep: fix refcount leak in qcom_pcie_ep_probe()

Linux PCI subsystem development

From: sashiko-bot@kernel.org
To: "Wentao Liang" <vulab@iscas.ac.cn>
Cc: linux-pci@vger.kernel.org
Subject: Re: [PATCH] PCI: qcom-ep: fix refcount leak in qcom_pcie_ep_probe()
Date: Tue, 09 Jun 2026 10:15:53 +0000
Message-ID: <20260609101554.36D4B1F00893@smtp.kernel.org>
In-Reply-To: <20260609100747.230521-1-vulab@iscas.ac.cn>

Thank you for your contribution! Sashiko AI review found 3 potential issues to consider:

New issues:
- [High] Calling `pm_runtime_disable()` manually in the error path after using `devm_pm_runtime_enable()` causes a double-disable, permanently breaking runtime PM.
- [High] The error path for a failed `pm_runtime_put_sync()` falls through to `err_rpm_put`, leading to a double decrement of the runtime PM usage_count.

Pre-existing issues:
- [High] Failing the entire probe function if `pm_runtime_put_sync()` returns an error is incorrect and brittle.
--

commit 2adeb10c2dada5b5e65aed93fd2d269f6cd5a790
Author: Wentao Liang <vulab@iscas.ac.cn>

PCI: qcom-ep: fix refcount leak in qcom_pcie_ep_probe()

This commit attempts to fix a runtime PM reference leak in the error paths
of qcom_pcie_ep_probe() by introducing a new error label. The new label
calls pm_runtime_put_noidle() and pm_runtime_disable() to properly release
the reference before returning the error.

> diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/controller/dwc/pcie-qcom-ep.c
> index 257c2bcb5f76f..c01d4d3b49520 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c
[ ... ]
> @@ -932,6 +932,10 @@ static int qcom_pcie_ep_probe(struct platform_device *pdev)
>  err_ep_deinit:
>  	dw_pcie_ep_deinit(&pcie_ep->pci.ep);
>  
> +err_rpm_put:
> +	pm_runtime_put_noidle(dev);
> +	pm_runtime_disable(dev);

[Severity: High]
Does calling pm_runtime_disable() here in qcom_pcie_ep_probe() cause a
double-disable of runtime PM?

Earlier in the probe function, the code uses devm_pm_runtime_enable(dev).
This registers a devres cleanup action that will automatically call
pm_runtime_disable() upon probe failure or device removal.

If we explicitly call pm_runtime_disable() here in the error path, it
appears it will be executed twice (once explicitly, and once via the devres
unwind). Because __pm_runtime_disable() unconditionally increments the
device's disable_depth counter, could this leave the counter permanently
elevated to 2 and break runtime PM for any subsequent probe attempts?
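
The counter arithmetic above can be checked with a small userspace model.
This is only a sketch, not kernel code: struct depth_model and the
depth_*() helpers are hypothetical names, and the model assumes that
disable_depth starts at 1 (as pm_runtime_init() sets it) and that an
enable call at depth 0 leaves the counter unchanged.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for dev->power.disable_depth. */
struct depth_model {
	unsigned int disable_depth;	/* runtime PM is usable only at 0 */
};

/* Assumption: a fresh device starts with runtime PM disabled. */
static void depth_init(struct depth_model *d)
{
	d->disable_depth = 1;
}

/* Models pm_runtime_enable(): an unbalanced enable does not go below 0. */
static void depth_enable(struct depth_model *d)
{
	if (d->disable_depth > 0)
		d->disable_depth--;
}

/* Models __pm_runtime_disable(): the counter goes up on every call. */
static void depth_disable(struct depth_model *d)
{
	d->disable_depth++;
}

static bool depth_rpm_usable(const struct depth_model *d)
{
	return d->disable_depth == 0;
}

/* A probe that fails after devm_pm_runtime_enable() with the patch applied. */
static void depth_failed_probe(struct depth_model *d)
{
	depth_enable(d);	/* devm_pm_runtime_enable() */
	depth_disable(d);	/* explicit pm_runtime_disable() at err_rpm_put */
	depth_disable(d);	/* devres action run during the unwind */
}

/* Counter value left behind by one failed probe. */
static unsigned int depth_after_failed_probe(void)
{
	struct depth_model d;

	depth_init(&d);
	depth_failed_probe(&d);
	return d.disable_depth;
}

/* Counter value right after a retried probe enables runtime PM again. */
static unsigned int depth_after_retry(void)
{
	struct depth_model d;

	depth_init(&d);
	depth_failed_probe(&d);
	depth_enable(&d);	/* second probe: devm_pm_runtime_enable() */
	return d.disable_depth;
}
```

In this model the failed probe leaves the counter at 2, so the enable in
a retried probe only brings it back to 1 and depth_rpm_usable() stays
false.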

[Severity: High]
Will this new error label result in a double decrement of the usage_count
if the probe fails during pm_runtime_put_sync()?

Later in qcom_pcie_ep_probe(), if pm_runtime_put_sync() fails, the
function jumps to err_disable_irqs:

    ret = pm_runtime_put_sync(dev);
    if (ret < 0) {
        dev_err(dev, "Failed to suspend device: %d\n", ret);
        goto err_disable_irqs;
    }

The err_disable_irqs path then falls through to this new err_rpm_put label.
Because pm_runtime_put_sync() maps to __pm_runtime_idle() with the
RPM_GET_PUT flag, the usage_count is decremented first before the idle
check even begins.

If it returns an error like -EAGAIN, the usage_count has already been
dropped once by pm_runtime_put_sync(). Falling through to
pm_runtime_put_noidle() then attempts to drop the usage_count a second
time. Could this wrongfully steal a concurrent thread's reference and
lead to an eventual counter underflow?
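
The sequence described above can be sketched as a single-threaded
userspace model. It is an illustration under stated assumptions, not the
kernel implementation: struct usage_model and the usage_*() helpers are
hypothetical, the racing thread is simulated by a flag, and
usage_put_noidle() is assumed to stop at zero the way an
atomic_add_unless(..., -1, 0) style decrement would.

```c
#include <errno.h>

/* Hypothetical userspace stand-in for dev->power.usage_count. */
struct usage_model {
	int usage_count;
};

/* Models a plain get: take one reference. */
static void usage_get(struct usage_model *d)
{
	d->usage_count++;
}

/* Models pm_runtime_put_noidle(); assumed not to decrement below zero. */
static void usage_put_noidle(struct usage_model *d)
{
	if (d->usage_count > 0)
		d->usage_count--;
}

/*
 * Models pm_runtime_put_sync(): the reference is dropped first, then the
 * idle check runs. racing_get simulates another thread taking a
 * reference between the drop and the check.
 */
static int usage_put_sync(struct usage_model *d, int racing_get)
{
	d->usage_count--;
	if (racing_get)
		usage_get(d);
	return d->usage_count > 0 ? -EAGAIN : 0;
}

/*
 * Probe takes one reference, put_sync fails because of the racing get,
 * and the error path falls through to the put_noidle at err_rpm_put.
 * Returns the count left while the racing thread still holds its
 * reference.
 */
static int usage_count_after_error_path(void)
{
	struct usage_model d = { .usage_count = 0 };

	usage_get(&d);			/* probe's own reference */
	if (usage_put_sync(&d, 1) < 0)	/* probe's reference is already gone */
		usage_put_noidle(&d);	/* drops the racing thread's reference */
	return d.usage_count;
}
```

In this model the count ends at 0 even though the simulated racing
thread has not released its reference, which is the stolen-reference
case the question above is asking about.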

[Severity: High]
This isn't a bug introduced by this patch, but is it correct to treat a
negative return from pm_runtime_put_sync() as a fatal probe failure?

Looking at the same code block referenced above in qcom_pcie_ep_probe(),
a return value like -EAGAIN simply means the device cannot be synchronously
suspended at this exact moment (for example, if another thread concurrently
incremented the usage counter).

Since this is a valid runtime PM state and not a hardware or initialization
failure, does tearing down the fully working device and failing the probe
cause spurious device initialization failures?

> +
>  	return ret;
>  }

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260609100747.230521-1-vulab@iscas.ac.cn?part=1
