From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A9E5035F19A for ; Tue, 9 Jun 2026 10:15:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781000155; cv=none; b=Nz3ObFXXuRJ7yLuouBVMAgHGywNN3Tm04s5BeLSkclahNFbrT9P74KW8W2haRtkIx0WpRMtvrHbNMR7X0oq3UymfUEMQgQmoOKAX5NlnIWCPkUvUAZgErb7jXK5w48+GAHpqVYh1KNlz1D3/0sroyQ1x1RToud0l4gu0QWQHJhI= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781000155; c=relaxed/simple; bh=tKOEfb0f3EYV39QeYGWacQ8MlJbTnGcxpnYC4LV+O78=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=U98wJ+727/tVa+9em0fdAdMXTVqptacISviRr4e0Rli7wO4YN01Nd3+T6mW+cZYft8jx70S8AXkpdPqfsLscAWQPPOi8SGBUb2Ra9Tk1UeB+9q+iYi/93n3d2ILfupXX3r9Dwk4VY4AWDJcQAangwdh5l+pBXwmT1HG+FTX3emY= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=hvY1WTsn; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hvY1WTsn" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 36D4B1F00893; Tue, 9 Jun 2026 10:15:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1781000154; bh=FdM+QMbd6X7YjAv6g7Tjjy9Ajj9mzxfk15kqvlP5Kvw=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=hvY1WTsnh4BXNgab7iZG8wpxsQhGFcaZ3b5D9iUWrKaLElvfilqpSWSj6Qn1/gV/u JIlb5P5vPybDZ4Y1gDj2yi7CtUnzkRGct4cB8WufwuHKFQF3FCYg8zLUdAUaIUhapC O3KP+0mo/ij+Sh/AbdTNF1XUGAa19Z76ZTfEuzhsHaeeEK+tUFG54U02XRsKWxN2lV skL1l0n/mZM1nJ5ZTtmcGJgJvzVNXi10Vf/hQlX2hJ50XrbVNJOMxO0019Mtioebf/ GP7N0QDTeCFKkykKK58oRrOJpn5raJ/a9IIZU+SHlEajCgyiSvKF18CaiMsD4itAaM q4P78noEFXVew== From: sashiko-bot@kernel.org Subject: Re: [PATCH] PCI: qcom-ep: fix refcount leak in qcom_pcie_ep_probe() Reply-To: sashiko-reviews@lists.linux.dev To: "Wentao Liang" Cc: linux-pci@vger.kernel.org In-Reply-To: <20260609100747.230521-1-vulab@iscas.ac.cn> References: <20260609100747.230521-1-vulab@iscas.ac.cn> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Tue, 09 Jun 2026 10:15:53 +0000 Message-Id: <20260609101554.36D4B1F00893@smtp.kernel.org> Precedence: bulk X-Mailing-List: linux-pci@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 3 potential issue(= s) to consider: New issues: - [High] Calling `pm_runtime_disable()` manually in the error path after us= ing `devm_pm_runtime_enable()` causes a double-disable, permanently breakin= g runtime PM. - [High] The error path for a failed `pm_runtime_put_sync()` falls through = to `err_rpm_put`, leading to a double decrement of the runtime PM usage_cou= nt. Pre-existing issues: - [High] Failing the entire probe function if `pm_runtime_put_sync()` retur= ns an error is incorrect and brittle. -- commit 2adeb10c2dada5b5e65aed93fd2d269f6cd5a790 Author: Wentao Liang PCI: qcom-ep: fix refcount leak in qcom_pcie_ep_probe() This commit attempts to fix a runtime PM reference leak in the error paths of qcom_pcie_ep_probe() by introducing a new error label. The new label calls pm_runtime_put_noidle() and pm_runtime_disable() to properly release the reference before returning the error. > diff --git a/drivers/pci/controller/dwc/pcie-qcom-ep.c b/drivers/pci/cont= roller/dwc/pcie-qcom-ep.c > index 257c2bcb5f76f..c01d4d3b49520 100644 > --- a/drivers/pci/controller/dwc/pcie-qcom-ep.c > +++ b/drivers/pci/controller/dwc/pcie-qcom-ep.c [ ... ] > @@ -932,6 +932,10 @@ static int qcom_pcie_ep_probe(struct platform_device= *pdev) > err_ep_deinit: > dw_pcie_ep_deinit(&pcie_ep->pci.ep); > =20 > +err_rpm_put: > + pm_runtime_put_noidle(dev); > + pm_runtime_disable(dev); [Severity: High] Does calling pm_runtime_disable() here in qcom_pcie_ep_probe() cause a double-disable of runtime PM? Earlier in the probe function, the code uses devm_pm_runtime_enable(dev). This registers a devres cleanup action that will automatically call pm_runtime_disable() upon probe failure or device removal. If we explicitly call pm_runtime_disable() here in the error path, it appears it will be executed twice (once explicitly, and once via the devres unwind). Because __pm_runtime_disable() unconditionally increments the device's disable_depth counter, could this leave the counter permanently elevated to 2 and break runtime PM for any subsequent probe attempts? [Severity: High] Will this new error label result in a double decrement of the usage_count if the probe fails during pm_runtime_put_sync()? Later in qcom_pcie_ep_probe(), if pm_runtime_put_sync() fails, the function jumps to err_disable_irqs: ret =3D pm_runtime_put_sync(dev); if (ret < 0) { dev_err(dev, "Failed to suspend device: %d\n", ret); goto err_disable_irqs; } The err_disable_irqs path then falls through to this new err_rpm_put label. Because pm_runtime_put_sync() maps to __pm_runtime_idle() with the RPM_GET_PUT flag, the usage_count is decremented first before the idle check even begins. If it returns an error like -EAGAIN, the usage_count has already been successfully dropped. Falling through to call pm_runtime_put_noidle() again will attempt to drop the usage_count a second time. Could this wrongfully steal a concurrent thread's reference and lead to an eventual counter underflow? [Severity: High] This isn't a bug introduced by this patch, but is it correct to treat a negative return from pm_runtime_put_sync() as a fatal probe failure? Looking at the same code block referenced above in qcom_pcie_ep_probe(), a return value like -EAGAIN simply means the device cannot be synchronously suspended at this exact moment (for example, if another thread concurrently incremented the usage counter). Since this is a valid runtime PM state and not a hardware or initialization failure, does tearing down the fully working device and failing the probe cause spurious device initialization failures? > + > return ret; > } --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260609100747.2305= 21-1-vulab@iscas.ac.cn?part=3D1