Devicetree
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Salih Erim" <salih.erim@amd.com>
Cc: robh@kernel.org, conor+dt@kernel.org, devicetree@vger.kernel.org
Subject: Re: [PATCH v11 4/5] iio: adc: versal-sysmon: add threshold event support
Date: Tue, 23 Jun 2026 02:15:19 +0000	[thread overview]
Message-ID: <20260623021519.6210A1F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260623014036.3865402-5-salih.erim@amd.com>

Thank you for your contribution! Sashiko AI review found 7 potential issue(s) to consider:
- [High] Use-After-Free due to flawed devres initialization order for the delayed worker.
- [High] Error paths in the IRQ handler skip clearing hardware flags but unconditionally return IRQ_HANDLED, causing a hard lockup.
- [High] The delayed worker inadvertently clears all pending interrupts from the hardware, causing unrelated events like voltage alarms to be permanently lost.
- [Medium] Atomic context violation causes a kernel panic if the device tree configures an IRQ for an I2C device.
- [Medium] Negative temperatures invoke undefined left-shift behavior and are not masked to 16 bits, causing 0xFFFFxxxx to be written to the threshold registers.
- [Medium] Race condition between sysfs event disable and the unmask worker re-enables disabled interrupts, causing spurious events.
- [Medium] Cached hysteresis value is not reverted if the hardware threshold write fails.
--

commit c435c4c9fc09d3cf9f148b3e41fd576f78414383
Author: Salih Erim <salih.erim@amd.com>

iio: adc: versal-sysmon: add threshold event support

Add threshold event support for temperature and supply voltage channels.
Temperature uses hysteresis mode while voltage uses window mode. It sets
up interrupt handlers and a delayed worker to poll for condition clear.

> diff --git a/drivers/iio/adc/versal-sysmon-core.c b/drivers/iio/adc/versal-sysmon-core.c
> --- a/drivers/iio/adc/versal-sysmon-core.c
> +++ b/drivers/iio/adc/versal-sysmon-core.c
[ ... ]
> @@ -52,6 +102,16 @@ static const struct iio_chan_spec temp_channels[] = {
>  	SYSMON_CHAN_TEMP(3, SYSMON_TEMP_MIN_MIN, "min_min"),
>  };
>  
> +static void sysmon_q8p7_to_millicelsius(s16 raw_data, int *val)
> +{
> +	*val = (raw_data * MILLIDEGREE_PER_DEGREE) >> SYSMON_FRACTIONAL_SHIFT;
> +}
> +
> +static void sysmon_millicelsius_to_q8p7(u32 *raw_data, int val)
> +{
> +	*raw_data = (val << SYSMON_FRACTIONAL_SHIFT) / MILLIDEGREE_PER_DEGREE;
> +}

[Severity: Medium]
Does this cause undefined behavior and corrupt hardware registers for negative
temperatures? 

If val is a negative temperature, the division yields a negative integer.
Assigning this to the u32 pointer raw_data will sign-extend it, filling the
upper 16 bits with 1s. Unlike the voltage path which explicitly casts to
(u16), this value is written directly to the SYSMON_TEMP_TH_UP and
SYSMON_TEMP_TH_LOW registers later in sysmon_write_event_value(), which
might corrupt reserved or mode configuration fields in the hardware.

[ ... ]
> +static int sysmon_write_event_config(struct iio_dev *indio_dev,
> +				     const struct iio_chan_spec *chan,
> +				     enum iio_event_type type,
> +				     enum iio_event_direction dir,
> +				     bool state)
> +{
[ ... ]
> +	case IIO_TEMP:
> +		if (state) {
> +			ret = regmap_write(sysmon->regmap, SYSMON_IER, mask);
> +			if (ret)
> +				return ret;
> +
> +			scoped_guard(spinlock_irq, &sysmon->irq_lock)
> +				sysmon->temp_mask &= ~mask;
> +		} else {
> +			ret = regmap_write(sysmon->regmap, SYSMON_IDR, mask);
> +			if (ret)
> +				return ret;
> +
> +			scoped_guard(spinlock_irq, &sysmon->irq_lock)
> +				sysmon->temp_mask |= mask;
> +		}
> +		return 0;

[Severity: Medium]
Is there a race condition here with sysmon_unmask_worker() that could lead to
spurious events?

If sysmon_unmask_worker() runs concurrently during the disable path, it
could acquire irq_lock right after SYSMON_IDR is written but before
sysmon->temp_mask is updated. The worker would read the old temp_mask
(indicating the event is still enabled) and write to SYSMON_IER,
re-enabling the interrupt in hardware. Once the worker releases the lock,
this sysfs thread completes the update, leaving the software state as
disabled but the hardware actively armed.

[ ... ]
> +static int sysmon_write_event_value(struct iio_dev *indio_dev,
> +				    const struct iio_chan_spec *chan,
> +				    enum iio_event_type type,
> +				    enum iio_event_direction dir,
> +				    enum iio_event_info info,
> +				    int val, int val2)
> +{
[ ... ]
> +	case IIO_TEMP:
> +		switch (info) {
[ ... ]
> +		case IIO_EV_INFO_HYSTERESIS:
> +			if (val < 0)
> +				return -EINVAL;
> +
> +			sysmon->temp_hysteresis = val;
> +
> +			return sysmon_update_temp_lower(sysmon);

[Severity: Medium]
Does this leave the cached hysteresis out of sync with hardware if the write
fails?

If sysmon_update_temp_lower() fails to write to the hardware register and
returns an error, sysmon->temp_hysteresis retains the new value. Future
writes to the upper threshold will use this corrupted cached value to
recompute and overwrite the lower threshold register.

[ ... ]
> +static int sysmon_handle_event(struct iio_dev *indio_dev, u32 event)
> +{
[ ... ]
> +	switch (event) {
> +	case SYSMON_BIT_TEMP:
> +		sysmon_push_event(indio_dev, SYSMON_TEMP_MAX);
> +
> +		ret = regmap_write(sysmon->regmap, SYSMON_IDR, BIT(SYSMON_BIT_TEMP));
> +		if (ret)
> +			return ret;

[Severity: High]
If this regmap_write() fails, does this cause an unrecoverable interrupt storm?

If we return an error early, the level-sensitive hardware interrupt remains
asserted because it was not masked or cleared. The caller, sysmon_iio_irq(),
ignores this error and returns IRQ_HANDLED, bypassing the core spurious
interrupt detector. The still-asserted hardware interrupt will immediately
re-trigger the handler, creating a hard lockup.

[ ... ]
> +static void sysmon_unmask_worker(struct work_struct *work)
> +{
> +	struct sysmon *sysmon =
> +		container_of(work, struct sysmon, sysmon_unmask_work.work);
> +	unsigned int isr;
> +
> +	/*
> +	 * If the ISR read fails, skip processing to avoid acting
> +	 * on undefined data.
> +	 */
> +	scoped_guard(spinlock_irq, &sysmon->irq_lock) {
> +		if (regmap_read(sysmon->regmap, SYSMON_ISR, &isr))
> +			break;

[Severity: Medium]
Will this trigger a 'scheduling while atomic' panic if the regmap is backed
by I2C?

The code acquires a spinlock with IRQs disabled and then calls regmap_read().
If fwnode_irq_get() succeeds for an I2C device, the regmap relies on a mutex
and will attempt to sleep, which isn't allowed while holding a spinlock. The
same issue appears in sysmon_iio_irq(), which executes in hardirq context
and blindly requests a hardirq regardless of regmap sleepability.

> +		regmap_write(sysmon->regmap, SYSMON_ISR, isr);

[Severity: High]
Does this unconditionally clear all pending interrupts from the hardware,
causing unrelated events to be permanently lost?

SYSMON_ISR is a Write-1-to-Clear register. If a voltage alarm (like
SYSMON_BIT_ALARM0) fires and is set in SYSMON_ISR right before this worker
reads the register, writing it back will clear the alarm. Since
sysmon_unmask_temp() strictly ignores all bits except the temperature mask,
the voltage alarm is permanently discarded without generating an IIO event.

> +		sysmon_unmask_temp(sysmon, isr);
> +	}
> +
> +	if (sysmon->masked_temp)
> +		schedule_delayed_work(&sysmon->sysmon_unmask_work,
> +				      msecs_to_jiffies(SYSMON_UNMASK_WORK_DELAY_MS));
> +	else
> +		regmap_write(sysmon->regmap, SYSMON_STATUS_RESET, 1);
> +}
> +
> +static irqreturn_t sysmon_iio_irq(int irq, void *data)
> +{
[ ... ]
> +	sysmon_handle_events(indio_dev, isr);
> +	schedule_delayed_work(&sysmon->sysmon_unmask_work,
> +			      msecs_to_jiffies(SYSMON_UNMASK_WORK_DELAY_MS));
> +
> +	return IRQ_HANDLED;
> +}

[Severity: High]
Since the return value of sysmon_handle_events() is ignored, will we always
return IRQ_HANDLED even if we failed to clear the hardware interrupt condition?

As mentioned above, if sysmon_handle_events() encounters an error masking
the interrupt, returning IRQ_HANDLED perpetually hides the issue from the IRQ
subsystem, leading to an interrupt storm.

[ ... ]
> +static int sysmon_init_interrupt(struct sysmon *sysmon,
> +				 struct device *dev,
> +				 struct iio_dev *indio_dev,
> +				 int irq)
> +{
[ ... ]
> +	INIT_DELAYED_WORK(&sysmon->sysmon_unmask_work, sysmon_unmask_worker);
> +
> +	ret = regmap_read(sysmon->regmap, SYSMON_IMR, &imr);
> +	if (ret)
> +		return ret;
> +	sysmon->temp_mask = imr & SYSMON_TEMP_INTR_MASK;
> +
> +	ret = devm_request_irq(dev, irq, sysmon_iio_irq, 0, "sysmon-irq", indio_dev);
> +	if (ret)
> +		return ret;
> +
> +	return devm_add_action_or_reset(dev, sysmon_disable_interrupts, sysmon);
> +}

[Severity: High]
Does this devres initialization order cause a Use-After-Free during teardown?

devres cleans up resources in reverse order (LIFO). Because
devm_add_action_or_reset() is called after devm_request_irq(),
sysmon_disable_interrupts() will execute first. It calls
cancel_delayed_work_sync(), but the IRQ handler is still active.

If an interrupt fires concurrently or is already in-flight, sysmon_iio_irq()
can unconditionally call schedule_delayed_work(), placing the work back on
the queue.

When the IRQ is subsequently freed and the sysmon memory is released, the
delayed worker will execute on freed memory 500ms later.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260623014036.3865402-1-salih.erim@amd.com?part=4

  reply	other threads:[~2026-06-23  2:15 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23  1:40 [PATCH v11 0/5] iio: adc: add Versal SysMon driver Salih Erim
2026-06-23  1:40 ` [PATCH v11 1/5] dt-bindings: iio: adc: add xlnx,versal-sysmon binding Salih Erim
2026-06-23  1:40 ` [PATCH v11 2/5] iio: adc: add Versal SysMon driver Salih Erim
2026-06-23  1:40 ` [PATCH v11 3/5] iio: adc: versal-sysmon: add I2C driver Salih Erim
2026-06-23  2:03   ` sashiko-bot
2026-06-23  1:40 ` [PATCH v11 4/5] iio: adc: versal-sysmon: add threshold event support Salih Erim
2026-06-23  2:15   ` sashiko-bot [this message]
2026-06-23  1:40 ` [PATCH v11 5/5] iio: adc: versal-sysmon: add oversampling support Salih Erim
2026-06-23  2:21   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260623021519.6210A1F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=robh@kernel.org \
    --cc=salih.erim@amd.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox