All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Huan He" <hehuan1@eswincomputing.com>
To: sashiko@lists.linux.dev, linux@roeck-us.net
Cc: robh@kernel.org, krzk+dt@kernel.org, conor+dt@kernel.org,
	p.zabel@pengutronix.de, linux-hwmon@vger.kernel.org,
	devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
	ningyu@eswincomputing.com, linmin@eswincomputing.com,
	pinkesh.vaghela@einfochips.com, luyulin@eswincomputing.com
Subject: Re: Re: Re: [PATCH v4 2/2] hwmon: Add Eswin EIC7700 PVT sensor driver
Date: Tue, 12 May 2026 17:51:08 +0800 (GMT+08:00)	[thread overview]
Message-ID: <1526048b.617a.19e1b98ef15.Coremail.hehuan1@eswincomputing.com> (raw)
In-Reply-To: <7728dc4b.6178.19e1b936a72.Coremail.hehuan1@eswincomputing.com>

Hi Sashiko, Guenter,

Thank you very much for your detailed review. We appreciate the feedback.

> 
> Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:
> - [High] Hardware access with disabled clock when CONFIG_PM=n
> - [High] Resource leak and hardware powered on during driver unbind
> - [High] System suspend state desynchronization causes potential crashes
> - [High] Unclocked hardware access in the interrupt handler
> - [Medium] Missing synchronization on timeout path causes race condition and stale data
> - [Medium] Unvalidated sysfs input and hwmon ABI violation for temp_offset
> - [Low] Mismatch between commit message and implemented voltage channels
> --
> 
> commit 226d2b003d7fbabad9b8abaf9c71cf6fdf386a91
> Author: Huan He <hehuan1@eswincomputing.com>
> 
> hwmon: Add Eswin EIC7700 PVT sensor driver
>     
> Add support for ESWIN EIC7700 Voltage and Temperature sensor. The driver
> supports temperature and voltage monitoring with polynomial conversion,
> and provides sysfs interface for sensor data access.
>     
> The PVT IP contains one temperature sensor and four voltage sensors for
> process variation monitoring.
> 
> > The PVT IP contains one temperature sensor and four voltage sensors for
> > process variation monitoring.
> 
> This isn't a bug, but the commit message states there are four voltage
> sensors, while the driver implementation only exposes a single voltage channel.
> Could the commit message be updated to match the implementation?

I will update the commit message accordingly.

The current driver only exposes one temperature channel and one voltage
channel, so the commit text will be changed to match the implementation.

> 
> > diff --git a/drivers/hwmon/eic7700-pvt.c b/drivers/hwmon/eic7700-pvt.c
> > new file mode 100644
> > index 0000000000000..d9d6a035defeb
> > --- /dev/null
> > +++ b/drivers/hwmon/eic7700-pvt.c
> [ ... ]
> > +static irqreturn_t eic7700_pvt_hard_isr(int irq, void *data)
> > +{
> > +	struct pvt_hwmon *pvt = data;
> > +	u32 val;
> > +
> > +	eic7700_pvt_update(pvt->regs + PVT_INT, PVT_INT_CLR, PVT_INT_CLR);
> > +	/*
> > +	 * Read the data, update the cache and notify a waiter of this event.
> > +	 */
> > +	val = readl(pvt->regs + PVT_DATA);
> 
> Could this cause a bus fault if a spurious or delayed interrupt fires while
> the device is in the RPM_SUSPENDED state?
> 
> When the device is suspended, the hardware clock is gated. Accessing registers
> without checking the PM runtime state could lead to a crash. Should this use
> pm_runtime_get_if_active() to protect against accessing unclocked hardware?

Yes, this is a valid concern.

If a delayed or spurious interrupt fires while the device is
runtime-suspended, the handler may otherwise touch registers after the
clock has been gated.

I will update the ISR to guard register accesses with
pm_runtime_get_if_active() and only access the hardware when the device is
runtime active. The temporary PM reference will be dropped with
pm_runtime_put_noidle() on the handled path.

The ISR will be changed along these lines:
+   int active;
+   if (IS_ENABLED(CONFIG_PM)) {
+       active = pm_runtime_get_if_active(pvt->dev);
+       if (active <= 0)
+           return IRQ_NONE;
+   }

    ... access registers ...

+   if (IS_ENABLED(CONFIG_PM))
+       pm_runtime_put_noidle(pvt->dev);

> 
> [ ... ]
> > +static int eic7700_pvt_read_data(struct pvt_hwmon *pvt,
> > +				 enum pvt_sensor_type type, long *val)
> > +{
> > +	unsigned long timeout;
> > +	u32 data;
> > +	int ret;
> > +
> > +	/*
> > +	 * Wait for PVT conversion to complete and update the data cache. The
> > +	 * data read procedure is following: set the requested PVT sensor mode,
> > +	 * enable conversion, wait until conversion is finished, then disable
> > +	 * conversion and IRQ, and read the cached data.
> > +	 */
> > +	reinit_completion(&pvt->conversion);
> > +
> > +	eic7700_pvt_set_mode(pvt, pvt_info[type].mode);
> > +	eic7700_pvt_update(pvt->regs + PVT_ENA, PVT_ENA_EN, PVT_ENA_EN);
> > +
> > +	/*
> > +	 * Wait with timeout since in case if the sensor is suddenly powered
> > +	 * down the request won't be completed and the caller will hang up on
> > +	 * this procedure until the power is back up again. Multiply the
> > +	 * timeout by the factor of two to prevent a false timeout.
> > +	 */
> > +	timeout = 2 * usecs_to_jiffies(ktime_to_us(pvt->timeout));
> > +	ret = wait_for_completion_timeout(&pvt->conversion, timeout);
> > +
> > +	eic7700_pvt_update(pvt->regs + PVT_ENA, PVT_ENA_EN, 0);
> > +	eic7700_pvt_update(pvt->regs + PVT_INT, PVT_INT_CLR, PVT_INT_CLR);
> > +
> > +	data = READ_ONCE(pvt->data_cache);
> > +
> > +	if (!ret)
> > +		return -ETIMEDOUT;
> 
> If the timeout expires, could an interrupt still be in flight?
> 
> If so, a subsequent read request would call reinit_completion(), and the
> delayed interrupt handler could execute, complete the new request prematurely,
> and cause the read to return stale data.
> 
> Does the timeout path need to call synchronize_irq() to ensure no interrupt
> handler is running before returning?

If a timeout occurs, a delayed interrupt may still be in flight and could
complete the next conversion request after reinit_completion().

I will add synchronize_irq() on the timeout path so that any in-flight
handler has completed before returning.

The eic7700_pvt_read_data() function will be changed along these lines:

    eic7700_pvt_update(pvt->regs + PVT_INT, PVT_INT_CLR, PVT_INT_CLR);

+   if (!ret)
+       synchronize_irq(pvt->irq);
+
    data = READ_ONCE(pvt->data_cache);

> 
> [ ... ]
> > +static int eic7700_pvt_write_trim(struct pvt_hwmon *pvt, long val)
> > +{
> > +	/*
> > +	 * Update PVT trim register safely while the controller is temporarily
> > +	 * disabled.
> > +	 */
> > +	eic7700_pvt_set_trim(pvt, val);
> > +
> > +	return 0;
> > +}
> 
> Does this code validate the user input before writing it to the hardware?
> 
> Writing a negative value could truncate to 0xFFFFFFFF and overwrite reserved
> bits in the register. Additionally, the hwmon ABI specifies that temp_offset
> should be in millidegrees Celsius, but this writes the raw value directly to
> the register without scaling.

Agreed.

The current temp_offset implementation exposes the raw PVT trim register
directly, which does not match the hwmon ABI requirement for millidegree
Celsius units.

I will remove the temp_offset attribute in the next revision instead of
exposing an ABI-incompatible raw hardware value.

> 
> [ ... ]
> > +static void eic7700_pvt_disable_pm_runtime(void *data)
> > +{
> > +	struct pvt_hwmon *pvt = data;
> > +
> > +	pm_runtime_dont_use_autosuspend(pvt->dev);
> > +	pm_runtime_disable(pvt->dev);
> > +}
> 
> Can this leak the clock reference and leave the hardware powered on if the
> device is unbound?
> 
> Because pm_runtime_put_autosuspend() is used after reads, the device remains
> active for a 3-second window. If the module is removed during this window,
> pm_runtime_disable() will cancel the pending suspend but it does not suspend
> a device that is currently RPM_ACTIVE.
> 
> Should this cleanup action explicitly suspend the device if it is active?

pm_runtime_disable() alone does not guarantee that an RPM_ACTIVE device
is runtime-suspended during driver unbind, so the clock may remain enabled
if the driver is removed inside the autosuspend window.

I will fix this by forcing the device into runtime suspend before
disabling runtime PM in the cleanup path.

The eic7700_pvt_disable_pm_runtime() function will be changed along these
lines:

    pm_runtime_dont_use_autosuspend(pvt->dev);
+   pm_runtime_force_suspend(pvt->dev);
    pm_runtime_disable(pvt->dev);

> 
> [ ... ]
> > +static int eic7700_pvt_probe(struct platform_device *pdev)
> > +{
> [ ... ]
> > +	ret = eic7700_pvt_init_iface(pvt);
> > +	if (ret) {
> > +		clk_disable_unprepare(pvt->clk);
> > +		return ret;
> > +	}
> > +
> > +	clk_disable_unprepare(pvt->clk);
> 
> Is this driver intended to be built when CONFIG_PM is disabled?
> 
> If CONFIG_PM is disabled, pm_runtime_get_sync() in the hwmon read function is
> a no-op that returns 1. Because the clock is explicitly disabled at the end of
> probe() and relies entirely on PM runtime to re-enable it, a read operation
> will access the hardware registers while the clock is gated, leading to an
> external abort or system hang.

With CONFIG_PM disabled, pm_runtime_get_sync() becomes a stub and the
runtime PM callbacks are never invoked, while probe() currently gates the
clock at the end. As a result, later hwmon register accesses may touch the
hardware with the clock disabled.

I will fix this by keeping the clock enabled permanently when CONFIG_PM=n
and gating it only during driver cleanup.

The eic7700_pvt_probe() function will be changed along these lines:

ret = devm_add_action_or_reset(dev, eic7700_pvt_disable_pm_runtime, pvt);
if (ret)
    return ret;

pm_runtime_put_autosuspend(dev);

+   if (!IS_ENABLED(CONFIG_PM)) {
+       ret = clk_prepare_enable(pvt->clk);
+       if (ret)
+           return dev_err_probe(&pdev->dev, ret,
+                        "Failed to enable clock\n");
+   }

In eic7700_pvt_disable_pm_runtime:

    pm_runtime_disable(pvt->dev);
+   if (!IS_ENABLED(CONFIG_PM))
+       clk_disable_unprepare(pvt->clk);

> 
> Also, eic7700_pvt_suspend() checks !pm_runtime_status_suspended(dev), which
> is always false when CONFIG_PM=n, causing it to erroneously call
> clk_disable_unprepare() on an already disabled clock.

This part is not applicable with CONFIG_PM=n.

The driver uses pm_ptr(&eic7700_pvt_pm_ops), so with CONFIG_PM disabled
the PM callback pointer is NULL and eic7700_pvt_suspend() is not executed
by the PM core. Therefore there is no real double clk_disable_unprepare()
from that path.

> 
> [ ... ]
> > +static int __maybe_unused eic7700_pvt_suspend(struct device *dev)
> > +{
> > +	int ret = 0;
> > +
> > +	if (!pm_runtime_status_suspended(dev)) {
> > +		ret = eic7700_pvt_runtime_suspend(dev);
> > +		if (ret) {
> > +			dev_err(dev, "Failed to suspend: %d\n", ret);
> > +			return ret;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> 
> Does calling eic7700_pvt_runtime_suspend() directly leave the PM runtime
> state desynchronized?
> 
> This manually gates the clock if the device is active, but it fails to update
> the device's PM runtime status. The PM core still considers the device to be
> RPM_ACTIVE.
> 
> If a kernel thread calls pm_runtime_get_sync() before late suspend, it will
> see the state as active, return immediately without re-enabling the clock,
> and crash when accessing registers.
> 
> Should this use pm_runtime_force_suspend() and pm_runtime_force_resume() to
> safely maintain state consistency?

Calling the driver's runtime suspend/resume callbacks directly from the
system sleep callbacks does not update the PM core state, so the runtime
PM status can become inconsistent with the actual clock state.

I will switch the system sleep callbacks to pm_runtime_force_suspend() and
pm_runtime_force_resume() in the next revision.

eic7700_pvt_pm_ops will be changed along these lines:

-   SYSTEM_SLEEP_PM_OPS(eic7700_pvt_suspend, eic7700_pvt_resume)
+   SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)

Best regards,
Huan He

  reply	other threads:[~2026-05-12  9:51 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-30  6:41 [PATCH v4 0/2] Add driver support for ESWIN EIC7700 PVT controller hehuan1
2026-04-30  6:44 ` [PATCH v4 1/2] dt-bindings: hwmon: Add Eswin EIC7700 PVT sensor hehuan1
2026-04-30  7:00   ` sashiko-bot
2026-05-12  9:14     ` Huan He
2026-05-03 12:03   ` Krzysztof Kozlowski
2026-05-12  9:16     ` Huan He
2026-05-12 14:26       ` Guenter Roeck
2026-05-13  7:03         ` Huan He
2026-05-13 19:03       ` Krzysztof Kozlowski
2026-04-30  6:45 ` [PATCH v4 2/2] hwmon: Add Eswin EIC7700 PVT sensor driver hehuan1
2026-04-30  8:08   ` sashiko-bot
2026-05-12  9:45     ` Huan He
2026-05-12  9:51       ` Huan He [this message]
2026-04-30 20:24 ` [PATCH v4 0/2] Add driver support for ESWIN EIC7700 PVT controller Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1526048b.617a.19e1b98ef15.Coremail.hehuan1@eswincomputing.com \
    --to=hehuan1@eswincomputing.com \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=krzk+dt@kernel.org \
    --cc=linmin@eswincomputing.com \
    --cc=linux-hwmon@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=luyulin@eswincomputing.com \
    --cc=ningyu@eswincomputing.com \
    --cc=p.zabel@pengutronix.de \
    --cc=pinkesh.vaghela@einfochips.com \
    --cc=robh@kernel.org \
    --cc=sashiko@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.