Re: [PATCH v9 2/6] platform: arm64: Add driver for EC found on Qualcomm reference devices

Devicetree
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Anvesh Jain P" <anvesh.p@oss.qualcomm.com>
Cc: robh@kernel.org, krzk+dt@kernel.org, conor+dt@kernel.org,
	devicetree@vger.kernel.org
Subject: Re: [PATCH v9 2/6] platform: arm64: Add driver for EC found on Qualcomm reference devices
Date: Tue, 12 May 2026 04:44:15 +0000	[thread overview]
Message-ID: <20260512044415.A4447C2BCC7@smtp.kernel.org> (raw)
In-Reply-To: <20260511-add-driver-for-ec-v9-2-e5437c39b7f8@oss.qualcomm.com>

Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:
- [High] The driver unconditionally fails to probe if CONFIG_THERMAL is disabled, rendering the Kconfig allowance useless.
- [High] Returning `IRQ_HANDLED` on I2C read failure causes an infinite interrupt storm.
- [High] Probe error paths fail to disable EC SCI events, leaving the hardware in an active interrupt-generating state.
- [High] Hardware cleanup in `remove()` races with the still-active devm-managed thermal cooling devices.
- [High] Insufficient validation of I2C read response lengths leads to uninitialized kernel stack memory being used.
- [Medium] The driver fails to re-initialize the EC state after receiving a reset event.
- [Medium] The driver does not implement the temperature sensors claimed in the commit message.
--

commit eeb96765c7d798e7194e02ed8e7a83c95dd5ad82
Author: Sibi Sankar <sibi.sankar@oss.qualcomm.com>

platform: arm64: Add driver for EC found on Qualcomm reference devices
    
Add Embedded controller driver support for Hamoa/Purwa/Glymur qualcomm
reference boards.

> Add Embedded controller driver support for Hamoa/Purwa/Glymur qualcomm
> reference boards. It handles fan control, temperature sensors, access
> to EC state changes and supports reporting suspend entry/exit to the
> EC.

The commit message mentions handling "temperature sensors", but I don't
see any logic to register thermal zones, expose hwmon sensors, or read
current temperatures from the EC.

Is the thermistor reading logic missing from this patch?

> diff --git a/drivers/platform/arm64/Kconfig b/drivers/platform/arm64/Kconfig
> index 10f905d7d6bfa..e32e01b2a9bdd 100644
> --- a/drivers/platform/arm64/Kconfig
> +++ b/drivers/platform/arm64/Kconfig
> @@ -90,4 +90,17 @@ config EC_LENOVO_THINKPAD_T14S
>  
>  	  Say M or Y here to include this support.
>  
> +config EC_QCOM_HAMOA
> +	tristate "Embedded Controller driver for Qualcomm Hamoa/Glymur reference devices"
> +	depends on ARCH_QCOM || COMPILE_TEST
> +	depends on I2C
> +	depends on THERMAL || THERMAL=n

[ ... ]

> diff --git a/drivers/platform/arm64/qcom-hamoa-ec.c b/drivers/platform/arm64/qcom-hamoa-ec.c
> new file mode 100644
> index 0000000000000..a018f7bf35d2a
> --- /dev/null
> +++ b/drivers/platform/arm64/qcom-hamoa-ec.c

[ ... ]

> +static int qcom_ec_probe(struct i2c_client *client)
> +{

[ ... ]

> +	for (i = 0; i < ec->thermal_cap.fan_cnt; i++) {
> +		struct qcom_ec_cooling_dev *ec_cdev = &ec->ec_cdev[i];
> +		char name[EC_FAN_NAME_SIZE];
> +
> +		scnprintf(name, sizeof(name), "qcom_ec_fan_%u", i);
> +		ec_cdev->fan_id = i + 1;
> +		ec_cdev->parent_dev = dev;
> +
> +		ec_cdev->cdev = devm_thermal_of_cooling_device_register(dev, NULL, name, ec_cdev,
> +									&qcom_ec_thermal_ops);
> +		if (IS_ERR(ec_cdev->cdev)) {
> +			return dev_err_probe(dev, PTR_ERR(ec_cdev->cdev),
> +					     "Failed to register fan%d cooling device\n", i);
> +		}
> +	}
> +
> +	return 0;
> +}

If CONFIG_THERMAL is disabled, devm_thermal_of_cooling_device_register()
evaluates to a stub that returns ERR_PTR(-ENODEV).

Does this cause qcom_ec_probe() to always fail when THERMAL=n?

Should the driver handle -ENODEV gracefully to allow non-thermal functionality
to work as intended by the Kconfig allowance?

> +static int qcom_ec_read(struct qcom_ec *ec, u8 cmd, u8 resp_len, u8 *resp)
> +{
> +	int ret;
> +
> +	ret = i2c_smbus_read_i2c_block_data(ec->client, cmd, resp_len, resp);
> +	if (ret < 0)
> +		return ret;
> +	else if (ret == 0 || ret == 0xff)
> +		return -EOPNOTSUPP;
> +
> +	if (resp[0] >= resp_len)
> +		return -EINVAL;
> +
> +	return 0;
> +}

[ ... ]

> +static int qcom_ec_read_fw_version(struct device *dev)
> +{
> +	struct i2c_client *client = to_i2c_client(dev);
> +	struct qcom_ec *ec = i2c_get_clientdata(client);
> +	struct qcom_ec_version *version = &ec->version;
> +	u8 resp[EC_FW_VERSION_RESP_LEN];
> +	int ret;
> +
> +	ret = qcom_ec_read(ec, EC_FW_VERSION_CMD, EC_FW_VERSION_RESP_LEN, resp);
> +	if (ret < 0)
> +		return ret;
> +
> +	version->main_version = resp[3];
> +	version->sub_version = resp[2];
> +	version->test_version = resp[1];

When i2c_smbus_read_i2c_block_data() succeeds, it returns the actual number
of bytes read. If a short read occurs, the remainder of the resp buffer
is untouched.

Since callers like qcom_ec_read_fw_version() allocate resp on the stack
without initialization, could this result in using uninitialized stack memory
if ret does not equal the expected length?

> +static irqreturn_t qcom_ec_irq(int irq, void *data)
> +{
> +	struct qcom_ec *ec = data;
> +	struct device *dev = &ec->client->dev;
> +	int val;
> +
> +	val = i2c_smbus_read_byte_data(ec->client, EC_SCI_EVT_READ_CMD);
> +	if (val < 0) {
> +		dev_err_ratelimited(dev, "Failed to read EC SCI Event: %d\n", val);
> +		return IRQ_HANDLED;
> +	}

If the I2C read fails here, the event is not actually cleared on the EC.

If this is a level-triggered interrupt line, will returning IRQ_HANDLED
trick the kernel's spurious interrupt detector into thinking the interrupt
was serviced, resulting in an endless loop and an interrupt storm?

Would returning IRQ_NONE be more appropriate?

> +	switch (val) {
> +	case EC_FAN1_STATUS_CHANGE_EVT:
> +		dev_dbg_ratelimited(dev, "Fan1 status changed\n");
> +		break;

[ ... ]

> +	case EC_RECOVERED_FROM_RESET_EVT:
> +		dev_dbg_ratelimited(dev, "EC recovered from reset\n");
> +		break;

When the EC resets, it typically loses volatile state like the SCI event
mask and active fan debug overrides.

Does the driver need to re-initialize the EC hardware state here to prevent
it from silently stopping events or dropping thermal overrides?

> +static int qcom_ec_probe(struct i2c_client *client)
> +{
> +	struct device *dev = &client->dev;
> +	struct qcom_ec *ec;
> +	unsigned int i;
> +	int ret;
> +
> +	ec = devm_kzalloc(dev, sizeof(*ec), GFP_KERNEL);
> +	if (!ec)
> +		return -ENOMEM;
> +
> +	ec->client = client;
> +
> +	ret = devm_request_threaded_irq(dev, client->irq, NULL, qcom_ec_irq,
> +					IRQF_ONESHOT, "qcom_ec", ec);
> +	if (ret < 0)
> +		return ret;
> +
> +	i2c_set_clientdata(client, ec);
> +
> +	ret = qcom_ec_read_fw_version(dev);
> +	if (ret < 0)
> +		return dev_err_probe(dev, ret, "Failed to read EC firmware version\n");
> +
> +	ret = qcom_ec_sci_evt_control(dev, true);
> +	if (ret < 0)
> +		return dev_err_probe(dev, ret, "Failed to enable SCI events\n");
> +
> +	ret = qcom_ec_thermal_capabilities(dev);
> +	if (ret < 0)
> +		return dev_err_probe(dev, ret, "Failed to read thermal capabilities\n");

If probe fails at qcom_ec_thermal_capabilities() or during the subsequent
cooling device registration, devres will automatically free the IRQ handler.

However, qcom_ec_sci_evt_control(dev, true) has already configured the
hardware to assert interrupts.

Could this cause an unhandled IRQ storm after the handler is removed?
Should the SCI events be disabled on probe failure, perhaps using
devm_add_action_or_reset()?

> +static void qcom_ec_remove(struct i2c_client *client)
> +{
> +	struct qcom_ec *ec = i2c_get_clientdata(client);
> +	struct device *dev = &client->dev;
> +	int ret;
> +
> +	ret = qcom_ec_sci_evt_control(dev, false);
> +	if (ret < 0)
> +		dev_err(dev, "Failed to disable SCI events: %d\n", ret);
> +
> +	for (int i = 0; i < ec->thermal_cap.fan_cnt; i++) {
> +		struct qcom_ec_cooling_dev *ec_cdev = &ec->ec_cdev[i];
> +
> +		qcom_ec_fan_debug_mode_off(ec_cdev);
> +	}
> +}

Because the cooling devices are registered with
devm_thermal_of_cooling_device_register(), they are still active when
qcom_ec_remove() runs and are only cleaned up afterward.

Is there a race condition here where the thermal core or userspace could
call qcom_ec_fan_set_cur_state() concurrently with or immediately after
qcom_ec_fan_debug_mode_off()? This would turn the fan debug mode back on
and defeat the cleanup.

Should the fan debug mode cleanup be managed via devm_add_action_or_reset()
registered before the cooling devices, so it executes in the correct order?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260511-add-driver-for-ec-v9-0-e5437c39b7f8@oss.qualcomm.com?part=2

next prev parent reply	other threads:[~2026-05-12  4:44 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11 12:43 [PATCH v9 0/6] Add driver for EC found on Qualcomm reference devices Anvesh Jain P
2026-05-11 12:43 ` [PATCH v9 1/6] dt-bindings: embedded-controller: Add Qualcomm reference device EC description Anvesh Jain P
2026-05-11 12:43 ` [PATCH v9 2/6] platform: arm64: Add driver for EC found on Qualcomm reference devices Anvesh Jain P
2026-05-11 14:10   ` Ilpo Järvinen
2026-05-12 11:58     ` Anvesh Jain P
2026-05-12 12:07       ` Ilpo Järvinen
2026-05-12  4:44   ` sashiko-bot [this message]
2026-05-11 12:43 ` [PATCH v9 3/6] arm64: dts: qcom: glymur-crd: Add Embedded controller node Anvesh Jain P
2026-05-11 12:43 ` [PATCH v9 4/6] arm64: dts: qcom: x1-crd: " Anvesh Jain P
2026-05-11 12:43 ` [PATCH v9 5/6] arm64: dts: qcom: hamoa-iot-evk: " Anvesh Jain P
2026-05-11 12:43 ` [PATCH v9 6/6] arm64: defconfig: Enable Qualcomm reference device EC driver Anvesh Jain P
2026-05-11 13:40   ` Dmitry Baryshkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260512044415.A4447C2BCC7@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=anvesh.p@oss.qualcomm.com \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=krzk+dt@kernel.org \
    --cc=robh@kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox