Linux Hardware Monitor development
 help / color / mirror / Atom feed
From: "Huan He" <hehuan1@eswincomputing.com>
To: "Guenter Roeck" <linux@roeck-us.net>
Cc: sashiko-reviews@lists.linux.dev, linux-hwmon@vger.kernel.org,
	robh@kernel.org, krzk+dt@kernel.org, conor+dt@kernel.org,
	devicetree@vger.kernel.org, 宁宇 <ningyu@eswincomputing.com>,
	"Min Lin" <linmin@eswincomputing.com>,
	"Yulin Lu" <luyulin@eswincomputing.com>,
	pinkesh.vaghela@einfochips.com
Subject: Re: Re: Re: [PATCH v5 2/2] hwmon: Add Eswin EIC7700 PVT sensor driver
Date: Thu, 28 May 2026 16:33:51 +0800 (GMT+08:00)	[thread overview]
Message-ID: <14794db6.6aba.19e6db7f058.Coremail.hehuan1@eswincomputing.com> (raw)
In-Reply-To: <d049fb4.6590.19e3f9b6875.Coremail.hehuan1@eswincomputing.com>

Hi Guenter, Sashiko,

Gentle ping on this.

I am preparing the next revision and would like to confirm whether the
remaining interrupt-ordering concern still needs a code change.

My understanding is that eic7700_pvt_init_iface() clears the interrupt
status and disables PVT_ENA_EN before the IRQ is requested, so no new
interrupt is expected in the request_irq() -> pm_runtime_enable() window.

I plan to address the confirmed issues in the next revision, including:
  - replacing pm_runtime_put_noidle() in the ISR with pm_runtime_put()
  - removing pvt_clear_data() and the related devm_add_action()
  - disabling the clock on the IRQ request error path when !CONFIG_PM

Could you please let me know whether you still expect an additional change
for this part?

Thanks for your time and review.

Best regards,
Huan He

> 
> Hi Sashiko, Guenter,
> 
> Thank you very much for your detailed review. We appreciate the feedback.
> 
> > On 5/15/26 03:03, sashiko-bot@kernel.org wrote:
> > > Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:
> > > - [Critical] The timeout calculation for sensor conversion evaluates to 0 jiffies, causing all reads to instantly fail.
> > > - [High] Missing pm_runtime_disable() in the devres cleanup path leads to an unbalanced disable_depth.
> > > - [High] Clock leakage on the IRQ request error path when CONFIG_PM is disabled.
> > > - [High] Race condition between ISR and main thread leaks PM runtime active state.
> > > - [High] Incorrect initialization order causes an IRQ storm during probe if an interrupt fires early.
> > > - [High] Lack of internal locking in eic7700_pvt_hwmon_read allows concurrent hardware state corruption.
> > > - [Medium] Useless and incorrectly ordered devres action for unblocking sysfs readers.
> > > --
> > > 
> > > commit fe78922d5101b3a07ca5ea0fbd0fd5e9cba0dcc9
> > > Author: Huan He <hehuan1@eswincomputing.com>
> > > 
> > > hwmon: Add Eswin EIC7700 PVT sensor driver
> > > 
> > > Add support for ESWIN EIC7700 Voltage and Temperature sensor. The driver
> > > supports temperature and voltage monitoring with polynomial conversion,
> > > and provides sysfs interface for sensor data access.
> > > 
> > >> diff --git a/drivers/hwmon/eic7700-pvt.c b/drivers/hwmon/eic7700-pvt.c
> > >> index 0000000000000..cdcfa031cdbdb
> > >> --- /dev/null
> > >> +++ b/drivers/hwmon/eic7700-pvt.c
> > > [ ... ]
> > >> +static irqreturn_t eic7700_pvt_hard_isr(int irq, void *data)
> > >> +{
> > > [ ... ]
> > >> +	val = readl(pvt->regs + PVT_DATA);
> > >> +	WRITE_ONCE(pvt->data_cache, FIELD_GET(PVT_DATA_OUT, val));
> > >> +	complete(&pvt->conversion);
> > >> +
> > >> +	if (IS_ENABLED(CONFIG_PM))
> > >> +		pm_runtime_put_noidle(pvt->dev);
> > >> +
> > >> +	return IRQ_HANDLED;
> > >> +}
> > > 
> > > If the main thread in eic7700_pvt_hwmon_read() drops the usage count from 2
> > > to 1 before the ISR drops it from 1 to 0, will both threads fail to trigger
> > > the idle callback?
> > > 
> > > This could leave the device indefinitely in the active state. Should the ISR
> > > use pm_runtime_put_autosuspend() or pm_runtime_put() instead of
> > > pm_runtime_put_noidle()?
> 
> I will fix this by replacing pm_runtime_put_noidle() in the ISR with
> pm_runtime_put(), so the final reference drop can still trigger runtime PM
> idle handling.
> 
> The ISR will be changed along these lines:
> 
>     complete(&pvt->conversion);
> 
>     if (IS_ENABLED(CONFIG_PM))
> -       pm_runtime_put_noidle(pvt->dev);
> +       pm_runtime_put(pvt->dev);
> 
> > > 
> > > [ ... ]
> > >> +static int eic7700_pvt_read_data(struct pvt_hwmon *pvt,
> > >> +				 enum pvt_sensor_type type, long *val)
> > >> +{
> > >> +	unsigned long timeout;
> > > [ ... ]
> > >> +	timeout = 2 * usecs_to_jiffies(ktime_to_us(pvt->timeout));
> > >> +	ret = wait_for_completion_timeout(&pvt->conversion, timeout);
> > > 
> > > Because pvt->timeout is 333 microseconds, will usecs_to_jiffies(333) evaluate
> > > to 0 on standard kernel configurations like HZ=1000?
> > > 
> > > If so, this acts as a non-blocking check and wait_for_completion_timeout()
> > > could instantly return 0, causing all reads to permanently fail with
> > > -ETIMEDOUT.
> 
> This looks like a false positive.
> 
> usecs_to_jiffies() eventually goes through _usecs_to_jiffies(), which
> rounds up instead of truncating to zero:
> 
>     static inline unsigned long _usecs_to_jiffies(const unsigned int u)
>     {
>         return (u + (USEC_PER_SEC / HZ) - 1) / (USEC_PER_SEC / HZ);
>     }
> 
> So with HZ=1000:
> 
>     usecs_to_jiffies(333) = (333 + 1000 - 1) / 1000 = 1
> 
> Therefore for pvt->timeout = 333 us, the timeout does not collapse to 0
> jiffies. The current code evaluates to 2 jiffies, so
> wait_for_completion_timeout() remains blocking rather than becoming a
> non-blocking check.
> 
> > > 
> > > [ ... ]
> > >> +static int eic7700_pvt_hwmon_read(struct device *dev,
> > >> +				  enum hwmon_sensor_types type, u32 attr,
> > >> +				  int ch, long *val)
> > >> +{
> > >> +	struct pvt_hwmon *pvt = dev_get_drvdata(dev);
> > > 
> > > The driver registers the temperature sensor with the thermal subsystem via
> > > the HWMON_C_REGISTER_TZ flag.
> > > 
> > > If thermal zone background polling directly invokes this read callback while
> > > a user concurrently reads a hwmon sysfs attribute, is there a race condition?
> > > Does this need an internal mutex to prevent concurrent hardware state
> > > corruption?
> > > 
> > 
> > False positive. Polling is through the hwmon subsystem, which protects such
> > accesses with the hwmon subsystem lock.
> > 
> > 
> > The other concerns seem real. Please address.
> > 
> > Thanks,
> > Guenter
> > 
> > > [ ... ]
> > >> +static struct pvt_hwmon *eic7700_pvt_create_data(struct platform_device *pdev)
> > >> +{
> > > [ ... ]
> > >> +	ret = devm_add_action(dev, pvt_clear_data, pvt);
> > > 
> > > Because devres actions execute in reverse order of registration (LIFO), will
> > > this action run after the hwmon device is unregistered?
> > > 
> > > Since hwmon unregistration blocks until all active sysfs read operations
> > > finish, the reader would have already completed before pvt_clear_data runs,
> > > which might make it ineffective for unblocking sysfs readers.
> 
> I will remove pvt_clear_data() and the associated devm_add_action() call.
> 
> > > 
> > > [ ... ]
> > >> +static void eic7700_pvt_disable_pm_runtime(void *data)
> > >> +{
> > >> +	struct pvt_hwmon *pvt = data;
> > >> +
> > >> +	pm_runtime_dont_use_autosuspend(pvt->dev);
> > >> +	pm_runtime_force_suspend(pvt->dev);
> > >> +
> > >> +	if (!IS_ENABLED(CONFIG_PM))
> > >> +		clk_disable_unprepare(pvt->clk);
> > >> +}
> > > 
> > > Should this custom devres cleanup action call pm_runtime_disable()?
> > > 
> > > If the module is unloaded and then re-probed, could the unbalanced
> > > disable_depth cause an "Unbalanced pm_runtime_enable!" warning and corrupt
> > > the power management state?
> 
> This looks like a false positive.
> 
> The cleanup path already calls pm_runtime_force_suspend(), and
> pm_runtime_force_suspend() itself invokes pm_runtime_disable() in the PM
> core.
> Adding another explicit pm_runtime_disable() after
> pm_runtime_force_suspend() would over-increment disable_depth and break
> the reprobe path.
> 
> > > 
> > >> +
> > >> +static int eic7700_pvt_probe(struct platform_device *pdev)
> > >> +{
> > > [ ... ]
> > >> +	if (IS_ENABLED(CONFIG_PM))
> > >> +		clk_disable_unprepare(pvt->clk);
> > >> +
> > >> +	ret = eic7700_pvt_request_irq(pvt);
> > >> +	if (ret)
> > >> +		return ret;
> > > 
> > > If CONFIG_PM is disabled, the clock remains enabled. If
> > > eic7700_pvt_request_irq() fails here, does the function return without
> > > disabling the clock? This could permanently leak the clock reference because
> > > the devres cleanup action is not yet registered.
> 
> This concern is valid.
> 
> When CONFIG_PM is disabled, the clock remains enabled after probe-time
> initialization. If eic7700_pvt_request_irq() fails before the devres PM cleanup
> action is registered, the clock can indeed be leaked.
> 
> I will fix this by explicitly disabling the clock on the IRQ request error path
> when CONFIG_PM is disabled.
> 
> The probe will be changed along these lines:
> 
>     ret = eic7700_pvt_request_irq(pvt);
> -   if (ret)
> +   if (ret) {
> +       if (!IS_ENABLED(CONFIG_PM))
> +           clk_disable_unprepare(pvt->clk);
>         return ret;
> +   }
> 
> > > 
> > >> +
> > >> +	pm_runtime_enable(&pdev->dev);
> > > 
> > > Since the IRQ is unmasked before pm_runtime_enable() is called, what happens
> > > if the hardware asserts an interrupt in this window?
> > > 
> > > If eic7700_pvt_hard_isr() fires, pm_runtime_get_if_active() would return
> > > -EINVAL and the handler would return IRQ_NONE without clearing the interrupt.
> > > Since the interrupt is level-triggered (IRQF_TRIGGER_HIGH), could this cause
> > > an infinite IRQ storm that locks up the CPU?
> 
> My understanding is that eic7700_pvt_init_iface() already clears
> the interrupt status and disables PVT_ENA_EN before the IRQ is requested,
> so no new interrupt is expected in the request_irq() -> pm_runtime_enable()
> window.
> 
> Given that, do you think an additional change is still needed here?


      reply	other threads:[~2026-05-28  8:34 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15  9:19 [PATCH v5 0/2] Add driver support for ESWIN EIC7700 PVT controller hehuan1
2026-05-15  9:20 ` [PATCH v5 1/2] dt-bindings: hwmon: Add Eswin EIC7700 PVT sensor hehuan1
2026-05-20  6:56   ` Krzysztof Kozlowski
2026-05-15  9:21 ` [PATCH v5 2/2] hwmon: Add Eswin EIC7700 PVT sensor driver hehuan1
2026-05-15 10:03   ` sashiko-bot
2026-05-15 10:24     ` Guenter Roeck
2026-05-19  9:40       ` Huan He
2026-05-28  8:33         ` Huan He [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=14794db6.6aba.19e6db7f058.Coremail.hehuan1@eswincomputing.com \
    --to=hehuan1@eswincomputing.com \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=krzk+dt@kernel.org \
    --cc=linmin@eswincomputing.com \
    --cc=linux-hwmon@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=luyulin@eswincomputing.com \
    --cc=ningyu@eswincomputing.com \
    --cc=pinkesh.vaghela@einfochips.com \
    --cc=robh@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox