Re: [PATCH v11 4/5] iio: adc: versal-sysmon: add threshold event support

All of lore.kernel.org
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Salih Erim" <salih.erim@amd.com>
Cc: robh@kernel.org, conor+dt@kernel.org, devicetree@vger.kernel.org
Subject: Re: [PATCH v11 4/5] iio: adc: versal-sysmon: add threshold event support
Date: Tue, 23 Jun 2026 02:15:19 +0000	[thread overview]
Message-ID: <20260623021519.6210A1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260623014036.3865402-5-salih.erim@amd.com>

Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:
- [High] Use-After-Free due to flawed devres initialization order for the delayed worker.
- [High] Error paths in the IRQ handler skip clearing hardware flags but unconditionally return IRQ_HANDLED, causing a hard lockup.
- [High] The delayed worker inadvertently clears all pending interrupts from the hardware, causing unrelated events like voltage alarms to be permanently lost.
- [Medium] Atomic context violation causes a kernel panic if the device tree configures an IRQ for an I2C device.
- [Medium] Negative temperatures invoke undefined left-shift behavior and are not masked to 16 bits, causing 0xFFFFxxxx to be written to the threshold registers.
- [Medium] Race condition between sysfs event disable and the unmask worker re-enables disabled interrupts, causing spurious events.
- [Medium] Cached hysteresis value is not reverted if the hardware threshold write fails.
--

commit c435c4c9fc09d3cf9f148b3e41fd576f78414383
Author: Salih Erim <salih.erim@amd.com>

iio: adc: versal-sysmon: add threshold event support

Add threshold event support for temperature and supply voltage channels.
Temperature uses hysteresis mode while voltage uses window mode. It sets
up interrupt handlers and a delayed worker to poll for condition clear.

> diff --git a/drivers/iio/adc/versal-sysmon-core.c b/drivers/iio/adc/versal-sysmon-core.c
> --- a/drivers/iio/adc/versal-sysmon-core.c
> +++ b/drivers/iio/adc/versal-sysmon-core.c
[ ... ]
> @@ -52,6 +102,16 @@ static const struct iio_chan_spec temp_channels[] = {
>  	SYSMON_CHAN_TEMP(3, SYSMON_TEMP_MIN_MIN, "min_min"),
>  };
>  
> +static void sysmon_q8p7_to_millicelsius(s16 raw_data, int *val)
> +{
> +	*val = (raw_data * MILLIDEGREE_PER_DEGREE) >> SYSMON_FRACTIONAL_SHIFT;
> +}
> +
> +static void sysmon_millicelsius_to_q8p7(u32 *raw_data, int val)
> +{
> +	*raw_data = (val << SYSMON_FRACTIONAL_SHIFT) / MILLIDEGREE_PER_DEGREE;
> +}

[Severity: Medium]
Does this cause undefined behavior and corrupt hardware registers for negative
temperatures? 

If val is a negative temperature, the division yields a negative integer.
Assigning this to the u32 pointer raw_data will sign-extend it, filling the
upper 16 bits with 1s. Unlike the voltage path which explicitly casts to
(u16), this value is written directly to the SYSMON_TEMP_TH_UP and
SYSMON_TEMP_TH_LOW registers later in sysmon_write_event_value(), which
might corrupt reserved or mode configuration fields in the hardware.

[ ... ]
> +static int sysmon_write_event_config(struct iio_dev *indio_dev,
> +				     const struct iio_chan_spec *chan,
> +				     enum iio_event_type type,
> +				     enum iio_event_direction dir,
> +				     bool state)
> +{
[ ... ]
> +	case IIO_TEMP:
> +		if (state) {
> +			ret = regmap_write(sysmon->regmap, SYSMON_IER, mask);
> +			if (ret)
> +				return ret;
> +
> +			scoped_guard(spinlock_irq, &sysmon->irq_lock)
> +				sysmon->temp_mask &= ~mask;
> +		} else {
> +			ret = regmap_write(sysmon->regmap, SYSMON_IDR, mask);
> +			if (ret)
> +				return ret;
> +
> +			scoped_guard(spinlock_irq, &sysmon->irq_lock)
> +				sysmon->temp_mask |= mask;
> +		}
> +		return 0;

[Severity: Medium]
Is there a race condition here with sysmon_unmask_worker() that could lead to
spurious events?

If sysmon_unmask_worker() runs concurrently during the disable path, it
could acquire irq_lock right after SYSMON_IDR is written but before
sysmon->temp_mask is updated. The worker would read the old temp_mask
(indicating the event is still enabled) and write to SYSMON_IER,
re-enabling the interrupt in hardware. Once the worker releases the lock,
this sysfs thread completes the update, leaving the software state as
disabled but the hardware actively armed.

[ ... ]
> +static int sysmon_write_event_value(struct iio_dev *indio_dev,
> +				    const struct iio_chan_spec *chan,
> +				    enum iio_event_type type,
> +				    enum iio_event_direction dir,
> +				    enum iio_event_info info,
> +				    int val, int val2)
> +{
[ ... ]
> +	case IIO_TEMP:
> +		switch (info) {
[ ... ]
> +		case IIO_EV_INFO_HYSTERESIS:
> +			if (val < 0)
> +				return -EINVAL;
> +
> +			sysmon->temp_hysteresis = val;
> +
> +			return sysmon_update_temp_lower(sysmon);

[Severity: Medium]
Does this leave the cached hysteresis out of sync with hardware if the write
fails?

If sysmon_update_temp_lower() fails to write to the hardware register and
returns an error, sysmon->temp_hysteresis retains the new value. Future
writes to the upper threshold will use this corrupted cached value to
recompute and overwrite the lower threshold register.

[ ... ]
> +static int sysmon_handle_event(struct iio_dev *indio_dev, u32 event)
> +{
[ ... ]
> +	switch (event) {
> +	case SYSMON_BIT_TEMP:
> +		sysmon_push_event(indio_dev, SYSMON_TEMP_MAX);
> +
> +		ret = regmap_write(sysmon->regmap, SYSMON_IDR, BIT(SYSMON_BIT_TEMP));
> +		if (ret)
> +			return ret;

[Severity: High]
If this regmap_write() fails, does this cause an unrecoverable interrupt storm?

If we return an error early, the level-sensitive hardware interrupt remains
asserted because it was not masked or cleared. The caller, sysmon_iio_irq(),
ignores this error and returns IRQ_HANDLED, bypassing the core spurious
interrupt detector. The still-asserted hardware interrupt will immediately
re-trigger the handler, creating a hard lockup.

[ ... ]
> +static void sysmon_unmask_worker(struct work_struct *work)
> +{
> +	struct sysmon *sysmon =
> +		container_of(work, struct sysmon, sysmon_unmask_work.work);
> +	unsigned int isr;
> +
> +	/*
> +	 * If the ISR read fails, skip processing to avoid acting
> +	 * on undefined data.
> +	 */
> +	scoped_guard(spinlock_irq, &sysmon->irq_lock) {
> +		if (regmap_read(sysmon->regmap, SYSMON_ISR, &isr))
> +			break;

[Severity: Medium]
Will this trigger a 'scheduling while atomic' panic if the regmap is backed
by I2C?

The code acquires a spinlock with IRQs disabled and then calls regmap_read().
If fwnode_irq_get() succeeds for an I2C device, the regmap relies on a mutex
and will attempt to sleep, which isn't allowed while holding a spinlock. The
same issue appears in sysmon_iio_irq(), which executes in hardirq context
and blindly requests a hardirq regardless of regmap sleepability.

> +		regmap_write(sysmon->regmap, SYSMON_ISR, isr);

[Severity: High]
Does this unconditionally clear all pending interrupts from the hardware,
causing unrelated events to be permanently lost?

SYSMON_ISR is a Write-1-to-Clear register. If a voltage alarm (like
SYSMON_BIT_ALARM0) fires and is set in SYSMON_ISR right before this worker
reads the register, writing it back will clear the alarm. Since
sysmon_unmask_temp() strictly ignores all bits except the temperature mask,
the voltage alarm is permanently discarded without generating an IIO event.

> +		sysmon_unmask_temp(sysmon, isr);
> +	}
> +
> +	if (sysmon->masked_temp)
> +		schedule_delayed_work(&sysmon->sysmon_unmask_work,
> +				      msecs_to_jiffies(SYSMON_UNMASK_WORK_DELAY_MS));
> +	else
> +		regmap_write(sysmon->regmap, SYSMON_STATUS_RESET, 1);
> +}
> +
> +static irqreturn_t sysmon_iio_irq(int irq, void *data)
> +{
[ ... ]
> +	sysmon_handle_events(indio_dev, isr);
> +	schedule_delayed_work(&sysmon->sysmon_unmask_work,
> +			      msecs_to_jiffies(SYSMON_UNMASK_WORK_DELAY_MS));
> +
> +	return IRQ_HANDLED;
> +}

[Severity: High]
Since the return value of sysmon_handle_events() is ignored, will we always
return IRQ_HANDLED even if we failed to clear the hardware interrupt condition?

As mentioned above, if sysmon_handle_events() encounters an error masking
the interrupt, returning IRQ_HANDLED perpetually hides the issue from the IRQ
subsystem, leading to an interrupt storm.

[ ... ]
> +static int sysmon_init_interrupt(struct sysmon *sysmon,
> +				 struct device *dev,
> +				 struct iio_dev *indio_dev,
> +				 int irq)
> +{
[ ... ]
> +	INIT_DELAYED_WORK(&sysmon->sysmon_unmask_work, sysmon_unmask_worker);
> +
> +	ret = regmap_read(sysmon->regmap, SYSMON_IMR, &imr);
> +	if (ret)
> +		return ret;
> +	sysmon->temp_mask = imr & SYSMON_TEMP_INTR_MASK;
> +
> +	ret = devm_request_irq(dev, irq, sysmon_iio_irq, 0, "sysmon-irq", indio_dev);
> +	if (ret)
> +		return ret;
> +
> +	return devm_add_action_or_reset(dev, sysmon_disable_interrupts, sysmon);
> +}

[Severity: High]
Does this devres initialization order cause a Use-After-Free during teardown?

devres cleans up resources in reverse order (LIFO). Because
devm_add_action_or_reset() is called after devm_request_irq(),
sysmon_disable_interrupts() will execute first. It calls
cancel_delayed_work_sync(), but the IRQ handler is still active.

If an interrupt fires concurrently or is already in-flight, sysmon_iio_irq()
can unconditionally call schedule_delayed_work(), placing the work back on
the queue.

When the IRQ is subsequently freed and the sysmon memory is released, the
delayed worker will execute on freed memory 500ms later.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260623014036.3865402-1-salih.erim@amd.com?part=4

next prev parent reply	other threads:[~2026-06-23  2:15 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23  1:40 [PATCH v11 0/5] iio: adc: add Versal SysMon driver Salih Erim
2026-06-23  1:40 ` [PATCH v11 1/5] dt-bindings: iio: adc: add xlnx,versal-sysmon binding Salih Erim
2026-06-23  1:40 ` [PATCH v11 2/5] iio: adc: add Versal SysMon driver Salih Erim
2026-06-23  1:40 ` [PATCH v11 3/5] iio: adc: versal-sysmon: add I2C driver Salih Erim
2026-06-23  2:03   ` sashiko-bot
2026-06-23 11:33   ` Erim, Salih
2026-06-23  1:40 ` [PATCH v11 4/5] iio: adc: versal-sysmon: add threshold event support Salih Erim
2026-06-23  2:15   ` sashiko-bot [this message]
2026-06-23 11:36   ` Erim, Salih
2026-06-23  1:40 ` [PATCH v11 5/5] iio: adc: versal-sysmon: add oversampling support Salih Erim
2026-06-23  2:21   ` sashiko-bot
2026-06-23 11:37   ` Erim, Salih

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623021519.6210A1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=robh@kernel.org \
    --cc=salih.erim@amd.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.