Devicetree
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Gaurav Kohli" <gaurav.kohli@oss.qualcomm.com>
Cc: devicetree@vger.kernel.org, robh@kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH v3 2/8] soc: qcom: Add support for QMI TMD cooling devices
Date: Tue, 09 Jun 2026 10:37:37 +0000	[thread overview]
Message-ID: <20260609103737.8ECB21F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260609-qmi-tmd-v3-2-291a2ff4c634@oss.qualcomm.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] Registration failure leaves an ERR_PTR in tmd->cdev, leading to an invalid pointer dereference during unregistration.
- [High] Incorrect teardown sequence in qmi_tmd_exit() allows Use-After-Free via concurrent sysfs accesses and re-queued QMI work.
- [Medium] QCOM_QMI_TMD selects QCOM_QMI_HELPERS without inheriting its NET dependency, leading to Kconfig unmet dependency warnings.
- [Medium] Mixing goto-based error handling with scope-based cleanup (__free).
- [Medium] Unprotected access to `tmd->cur_state` creates a TOCTOU race, causing the device mitigation state to permanently desynchronize from the kernel's state.
--

commit af63888f1125808b48046c97b93e006d9c65c730
Author: Casey Connolly <casey.connolly@linaro.org>

soc: qcom: Add support for QMI TMD cooling devices

Add a Qualcomm QMI Thermal Mitigation Device (TMD) to support thermal
cooling devices backed by remote subsystems.

> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> index 2caadbbcf8307..a292ce57fd4a4 100644
> --- a/drivers/soc/qcom/Kconfig
> +++ b/drivers/soc/qcom/Kconfig
> @@ -128,6 +128,16 @@ config QCOM_QMI_HELPERS
>  	tristate
>  	depends on NET
>  
> +config QCOM_QMI_TMD
> +	bool "Qualcomm QMI TMD library" if COMPILE_TEST
> +	depends on ARCH_QCOM
> +	select QCOM_QMI_HELPERS

[Severity: Medium]
Since QCOM_QMI_HELPERS specifies a dependency on NET, does selecting it here
without also inheriting the NET dependency risk unmet direct dependency warnings
at build time if CONFIG_NET=n?

> diff --git a/drivers/soc/qcom/qmi_tmd.c b/drivers/soc/qcom/qmi_tmd.c
> new file mode 100644
> index 0000000000000..9d88ae48c8647
> --- /dev/null
> +++ b/drivers/soc/qcom/qmi_tmd.c
[ ... ]
> +static int qmi_tmd_set_cur_state(struct thermal_cooling_device *cdev,
> +				 unsigned long state)
> +{
> +	struct qmi_tmd *tmd = cdev->devdata;
> +	int ret;
> +
> +	if (state > tmd->max_state)
> +		return -EINVAL;
> +
> +	if (tmd->cur_state == state)
> +		return 0;
> +
> +	ret = qmi_tmd_send_state_request(tmd, state);
> +	if (!ret)
> +		tmd->cur_state = state;

[Severity: Medium]
Is there a potential TOCTOU race here when updating tmd->cur_state without
holding qmi_tmd_cli->mutex? 
    
If the QMI server restarts concurrently, qmi_tmd_register() might read the old
state locklessly to send to the new server before this assignment completes,
permanently desynchronizing the kernel's recorded state from the actual
hardware state.

[ ... ]
> +static int qmi_tmd_register(struct qmi_tmd_client *qmi_tmd_cli,
> +			    const char *label, u8 max_state)
> +{
[ ... ]
> +found:
> +	tmd->max_state = max_state;
> +
> +	/*
> +	 * If the cooling device already exists then the QMI service went away and
> +	 * came back. So just make sure the current cooling device state is
> +	 * reflected on the remote side and then return.
> +	 */
> +	if (tmd->cdev)
> +		return qmi_tmd_send_state_request(tmd, tmd->cur_state);
> +
> +	tmd->cdev = thermal_of_cooling_device_register(dev->of_node, index,
> +						       label, tmd, &qmi_tmd_cooling_ops);
> +	if (IS_ERR(tmd->cdev))
> +		return PTR_ERR(tmd->cdev);
> +
> +	return 0;
> +}

[Severity: High]
If thermal_of_cooling_device_register() fails, could this leave an ERR_PTR
in tmd->cdev? 
    
Later in qmi_tmd_unregister(), the check if (!tmd->cdev) will be bypassed
since an ERR_PTR evaluates to true, potentially causing a crash when
thermal_cooling_device_unregister() attempts to dereference it. Should
tmd->cdev be set back to NULL on failure?

[ ... ]
> +static void qmi_tmd_svc_arrive(struct work_struct *work)
> +{
> +	struct qmi_tmd_client *qmi_tmd_cli =
> +		container_of(work, struct qmi_tmd_client, svc_arrive_work);
> +
> +	struct tmd_get_dev_list_req req = { 0 };
> +	struct tmd_get_dev_list_resp *resp __free(kfree) = NULL;
> +	int ret, i;
> +	struct qmi_txn txn;
> +
> +	resp = kzalloc_obj(*resp, GFP_KERNEL);
> +	if (!resp) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
[ ... ]
> +out:
> +	if (ret)
> +		dev_err(qmi_tmd_cli->dev, "Failed to initialize TMD service: %d\n", ret);
> +}

[Severity: Medium]
Does mixing the scope-based cleanup macro __free(kfree) with traditional
goto-based error handling (goto out;) in the same function violate the
cleanup subsystem guidelines? This mixture can lead to confusing ownership
semantics.

[ ... ]
> +void qmi_tmd_exit(struct qmi_tmd_client *qmi_tmd_cli)
> +{
> +	if (!qmi_tmd_cli)
> +		return;
> +
> +	cancel_work_sync(&qmi_tmd_cli->svc_arrive_work);
> +	qmi_handle_release(&qmi_tmd_cli->handle);
> +	qmi_tmd_unregister(qmi_tmd_cli);
> +
> +	scoped_guard(mutex, &qmi_tmd_cli->mutex)
> +		qmi_tmd_cli->connection_active = false;
> +}

[Severity: High]
Can this teardown sequence cause a use-after-free or kernel crash? 

Because qmi_handle_release() is called before qmi_tmd_unregister(), the
cooling device remains exposed to sysfs while the QMI handle is being
destroyed. A concurrent sysfs write could trigger qmi_tmd_set_cur_state(),
which would attempt to allocate a transaction ID in the destroyed qmi->txns
IDR because connection_active is still true.
            
Additionally, because cancel_work_sync() runs before qmi_handle_release(),
there is a window where incoming QMI events can re-queue svc_arrive_work,
potentially executing after the driver unbinds.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260609-qmi-tmd-v3-0-291a2ff4c634@oss.qualcomm.com?part=2

  reply	other threads:[~2026-06-09 10:37 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-09 10:22 [PATCH v3 0/8] Add support for Qualcomm remoteproc subsystem cooling Gaurav Kohli
2026-06-09 10:22 ` [PATCH v3 1/8] dt-bindings: remoteproc: qcom,pas: add thermal mitigation properties Gaurav Kohli
2026-06-09 10:36   ` sashiko-bot
2026-06-09 10:47   ` Dmitry Baryshkov
2026-06-09 10:22 ` [PATCH v3 2/8] soc: qcom: Add support for QMI TMD cooling devices Gaurav Kohli
2026-06-09 10:37   ` sashiko-bot [this message]
2026-06-09 11:30   ` Dmitry Baryshkov
2026-06-09 12:08     ` Daniel Lezcano
2026-06-09 10:22 ` [PATCH v3 3/8] remoteproc: qcom: pas: register TMD thermal " Gaurav Kohli
2026-06-09 10:40   ` sashiko-bot
2026-06-09 11:05   ` Dmitry Baryshkov
2026-06-09 12:03   ` Konrad Dybcio
2026-06-09 10:22 ` [PATCH v3 4/8] arm64: dts: qcom: kodiak: Enable CDSP & Modem cooling Gaurav Kohli
2026-06-09 10:57   ` Dmitry Baryshkov
2026-06-09 10:23 ` [PATCH v3 5/8] arm64: dts: qcom: lemans: Enable CDSP cooling Gaurav Kohli
2026-06-09 10:23 ` [PATCH v3 6/8] arm64: dts: qcom: talos: " Gaurav Kohli
2026-06-09 10:23 ` [PATCH v3 7/8] arm64: dts: qcom: monaco: " Gaurav Kohli
2026-06-09 10:23 ` [PATCH v3 8/8] arm64: dts: qcom: hamoa: " Gaurav Kohli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260609103737.8ECB21F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=gaurav.kohli@oss.qualcomm.com \
    --cc=robh@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox