From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4C48B3AE6F3 for ; Tue, 9 Jun 2026 10:37:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781001459; cv=none; b=o07AJDFCs6J+STGvwJWl7RmN/YHJCPjHEA+WEdAbahjqFd4EOVhQO7GjUfuMnomE5sHuPy2dZ/6VjMCg03jGqAyf0xJAqNj5ftOFNeN3cTKrMVXMNvdlNlpKG3XDIZbPzZT4q7mNmfk2BFNGve3glDtGZno0xJUv5OBF2rIUicY= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781001459; c=relaxed/simple; bh=FdoibTqamt5Z5fDFB55A3RIvCUtNuqj2hE0WBSR8T3k=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=a9KZH8/CBB3tc1IyOyvf8rY4jKN8a2ofOMXXBSg9shXy5HnCge2fknODeK4BLJsYsJWYGFqQ3lguImbhRd/QqePcMb48jyAWBBO7KgHCP+LtTbv61E90aV+PXy4xQAT7p32PjCga0MJ1inT3dqkeviQSL8B9tT/mi0XPjdde9RM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=h1nV5dI7; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="h1nV5dI7" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8ECB21F00893; Tue, 9 Jun 2026 10:37:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1781001457; bh=j9jyYMbF8eO+m9MdVWQa9C27rlS9vJ96FjbB4vE72Dc=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=h1nV5dI7T3WtVOwR+VxsJlLCOf0Kcg/8mv4tGZmJU6SVuXs9R6/yHs1Y+XxY7meop 72NlKrjzkGvP8JxJPbyfPTCg+93pU6KSz6lALQgL0/8fCewIB4XkQ3x3cSlNlvLWNE sNohMYHlcZwWOX2BMV/H5WYDLf2+le0s5x1zfYHcSkuLHyLyubof7SO42+zr+eSThd ml6A3BNG5OOg0D3jFdZssRykNC3/QxTFcAMGB1d2zzxWNd4LszvwPmZ+D78xP35tJ+ HcjI4kns0oY+xeoY0+t5Do3Xb6g4xkCdcXEAMm1ZAawk176zFLF1BGYjRfyhoDtnY3 DAqQwRD8U5deg== From: sashiko-bot@kernel.org Subject: Re: [PATCH v3 2/8] soc: qcom: Add support for QMI TMD cooling devices Reply-To: sashiko-reviews@lists.linux.dev To: "Gaurav Kohli" Cc: devicetree@vger.kernel.org, robh@kernel.org, conor+dt@kernel.org In-Reply-To: <20260609-qmi-tmd-v3-2-291a2ff4c634@oss.qualcomm.com> References: <20260609-qmi-tmd-v3-2-291a2ff4c634@oss.qualcomm.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Tue, 09 Jun 2026 10:37:37 +0000 Message-Id: <20260609103737.8ECB21F00893@smtp.kernel.org> Precedence: bulk X-Mailing-List: devicetree@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 5 potential issue(= s) to consider: - [High] Registration failure leaves an ERR_PTR in tmd->cdev, leading to an= invalid pointer dereference during unregistration. - [High] Incorrect teardown sequence in qmi_tmd_exit() allows Use-After-Fre= e via concurrent sysfs accesses and re-queued QMI work. - [Medium] QCOM_QMI_TMD selects QCOM_QMI_HELPERS without inheriting its NET= dependency, leading to Kconfig unmet dependency warnings. - [Medium] Mixing goto-based error handling with scope-based cleanup (__fre= e). - [Medium] Unprotected access to `tmd->cur_state` creates a TOCTOU race, ca= using the device mitigation state to permanently desynchronize from the ker= nel's state. -- commit af63888f1125808b48046c97b93e006d9c65c730 Author: Casey Connolly soc: qcom: Add support for QMI TMD cooling devices Add a Qualcomm QMI Thermal Mitigation Device (TMD) to support thermal cooling devices backed by remote subsystems. > diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig > index 2caadbbcf8307..a292ce57fd4a4 100644 > --- a/drivers/soc/qcom/Kconfig > +++ b/drivers/soc/qcom/Kconfig > @@ -128,6 +128,16 @@ config QCOM_QMI_HELPERS > tristate > depends on NET > =20 > +config QCOM_QMI_TMD > + bool "Qualcomm QMI TMD library" if COMPILE_TEST > + depends on ARCH_QCOM > + select QCOM_QMI_HELPERS [Severity: Medium] Since QCOM_QMI_HELPERS specifies a dependency on NET, does selecting it here without also inheriting the NET dependency risk unmet direct dependency war= nings at build time if CONFIG_NET=3Dn? > diff --git a/drivers/soc/qcom/qmi_tmd.c b/drivers/soc/qcom/qmi_tmd.c > new file mode 100644 > index 0000000000000..9d88ae48c8647 > --- /dev/null > +++ b/drivers/soc/qcom/qmi_tmd.c [ ... ] > +static int qmi_tmd_set_cur_state(struct thermal_cooling_device *cdev, > + unsigned long state) > +{ > + struct qmi_tmd *tmd =3D cdev->devdata; > + int ret; > + > + if (state > tmd->max_state) > + return -EINVAL; > + > + if (tmd->cur_state =3D=3D state) > + return 0; > + > + ret =3D qmi_tmd_send_state_request(tmd, state); > + if (!ret) > + tmd->cur_state =3D state; [Severity: Medium] Is there a potential TOCTOU race here when updating tmd->cur_state without holding qmi_tmd_cli->mutex?=20 =20 If the QMI server restarts concurrently, qmi_tmd_register() might read the = old state locklessly to send to the new server before this assignment completes, permanently desynchronizing the kernel's recorded state from the actual hardware state. [ ... ] > +static int qmi_tmd_register(struct qmi_tmd_client *qmi_tmd_cli, > + const char *label, u8 max_state) > +{ [ ... ] > +found: > + tmd->max_state =3D max_state; > + > + /* > + * If the cooling device already exists then the QMI service went away = and > + * came back. So just make sure the current cooling device state is > + * reflected on the remote side and then return. > + */ > + if (tmd->cdev) > + return qmi_tmd_send_state_request(tmd, tmd->cur_state); > + > + tmd->cdev =3D thermal_of_cooling_device_register(dev->of_node, index, > + label, tmd, &qmi_tmd_cooling_ops); > + if (IS_ERR(tmd->cdev)) > + return PTR_ERR(tmd->cdev); > + > + return 0; > +} [Severity: High] If thermal_of_cooling_device_register() fails, could this leave an ERR_PTR in tmd->cdev?=20 =20 Later in qmi_tmd_unregister(), the check if (!tmd->cdev) will be bypassed since an ERR_PTR evaluates to true, potentially causing a crash when thermal_cooling_device_unregister() attempts to dereference it. Should tmd->cdev be set back to NULL on failure? [ ... ] > +static void qmi_tmd_svc_arrive(struct work_struct *work) > +{ > + struct qmi_tmd_client *qmi_tmd_cli =3D > + container_of(work, struct qmi_tmd_client, svc_arrive_work); > + > + struct tmd_get_dev_list_req req =3D { 0 }; > + struct tmd_get_dev_list_resp *resp __free(kfree) =3D NULL; > + int ret, i; > + struct qmi_txn txn; > + > + resp =3D kzalloc_obj(*resp, GFP_KERNEL); > + if (!resp) { > + ret =3D -ENOMEM; > + goto out; > + } [ ... ] > +out: > + if (ret) > + dev_err(qmi_tmd_cli->dev, "Failed to initialize TMD service: %d\n", re= t); > +} [Severity: Medium] Does mixing the scope-based cleanup macro __free(kfree) with traditional goto-based error handling (goto out;) in the same function violate the cleanup subsystem guidelines? This mixture can lead to confusing ownership semantics. [ ... ] > +void qmi_tmd_exit(struct qmi_tmd_client *qmi_tmd_cli) > +{ > + if (!qmi_tmd_cli) > + return; > + > + cancel_work_sync(&qmi_tmd_cli->svc_arrive_work); > + qmi_handle_release(&qmi_tmd_cli->handle); > + qmi_tmd_unregister(qmi_tmd_cli); > + > + scoped_guard(mutex, &qmi_tmd_cli->mutex) > + qmi_tmd_cli->connection_active =3D false; > +} [Severity: High] Can this teardown sequence cause a use-after-free or kernel crash?=20 Because qmi_handle_release() is called before qmi_tmd_unregister(), the cooling device remains exposed to sysfs while the QMI handle is being destroyed. A concurrent sysfs write could trigger qmi_tmd_set_cur_state(), which would attempt to allocate a transaction ID in the destroyed qmi->txns IDR because connection_active is still true. =20 Additionally, because cancel_work_sync() runs before qmi_handle_release(), there is a window where incoming QMI events can re-queue svc_arrive_work, potentially executing after the driver unbinds. --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260609-qmi-tmd-v3= -0-291a2ff4c634@oss.qualcomm.com?part=3D2