Re: [PATCH v3 4/5] iio: adc: versal-sysmon: add threshold event support

Devicetree
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Salih Erim" <salih.erim@amd.com>
Cc: robh@kernel.org, conor+dt@kernel.org, devicetree@vger.kernel.org
Subject: Re: [PATCH v3 4/5] iio: adc: versal-sysmon: add threshold event support
Date: Wed, 27 May 2026 13:31:35 +0000	[thread overview]
Message-ID: <20260527133135.C1ED11F000E9@smtp.kernel.org> (raw)
In-Reply-To: <20260527114211.174288-5-salih.erim@amd.com>

Thank you for your contribution! Sashiko AI review found 11 potential issue(s) to consider:
- [High] Data race and missing synchronization when updating `sysmon->temp_mask` and hardware interrupt mask registers.
- [High] Voltage threshold alarms act as one-shot events and become permanently disabled upon triggering.
- [High] Signed integer overflow in `sysmon_supply_processedtoraw` when scaling unconstrained user-supplied thresholds.
- [High] Unhandled regmap error during voltage alarm processing leads to an IRQ storm.
- [High] `sysmon_millicelsius_to_q8p7()` triggers undefined behavior and writes sign-extended garbage to reserved upper bits.
- [High] Requesting a hardirq handler for I2C devices will trigger a 'scheduling while atomic' kernel panic.
- [High] Lockless read of `sysmon->masked_temp` causes a TOCTOU race, and undocumented write to `SYSMON_STATUS_RESET` destroys telemetry data.
- [High] Voltage threshold writes clobber the format and mode bits in the hardware register.
- [High] `sysmon_update_temp_lower` is vulnerable to a signed integer underflow when calculating the lower threshold.
- [Medium] Reading the event configuration of a temperature channel during an active alarm incorrectly returns 0 (disabled).
- [Medium] Device interrupts are not disabled in the hardware during driver teardown.
--

commit bf3f4a0490a50d3240978d97650f909908cd4a27
Author: Salih Erim <salih.erim@amd.com>

iio: adc: versal-sysmon: add threshold event support

Add threshold event support for temperature and supply voltage channels.

> diff --git a/drivers/iio/adc/versal-sysmon-core.c b/drivers/iio/adc/versal-sysmon-core.c
> index ebe052f69820f..04977c9c8871f 100644
> --- a/drivers/iio/adc/versal-sysmon-core.c
> +++ b/drivers/iio/adc/versal-sysmon-core.c
[ ... ]
> +static void sysmon_millicelsius_to_q8p7(u32 *raw_data, int val)
> +{
> +	*raw_data = (val << SYSMON_FRACTIONAL_SHIFT) / (int)MILLI;
> +}

[Severity: High]
Will left-shifting a negative val cause undefined behavior in C? 

Additionally, when the division produces a negative integer, does the
implicit cast to u32 sign-extend and set the upper 16 bits to 1? Could
this clobber reserved upper bits when written to the threshold registers?

[ ... ]
> +static void sysmon_supply_processedtoraw(int val, u32 reg_val, u32 *raw_data)
> +{
> +	int exponent = FIELD_GET(SYSMON_MODE_MASK, reg_val);
> +	int format = FIELD_GET(SYSMON_FMT_MASK, reg_val);
> +	int scale, tmp;
> +
> +	scale = BIT(SYSMON_SUPPLY_MANTISSA_BITS - exponent);
> +	tmp = (val * scale) / (int)MILLI;

[Severity: High]
Can val * scale silently overflow a 32-bit signed integer if userspace
provides a large threshold value (e.g., > 32767)? 

Would it be safer to use a 64-bit cast here to avoid undefined behavior
before clamping?

[ ... ]
> +static int sysmon_read_event_config(struct iio_dev *indio_dev,
> +				    const struct iio_chan_spec *chan,
> +				    enum iio_event_type type,
> +				    enum iio_event_direction dir)
> +{
> +	u32 alarm_event_mask = sysmon_get_event_mask(chan->address);
> +	struct sysmon *sysmon = iio_priv(indio_dev);
> +	unsigned int imr;
> +	int config_value;
> +	int ret;
> +
> +	ret = regmap_read(sysmon->regmap, SYSMON_IMR, &imr);
> +	if (ret)
> +		return ret;

[Severity: Medium]
When a temperature alarm triggers, the interrupt handler temporarily masks
the interrupt in hardware to prevent an IRQ storm.

If userspace reads the sysfs _en attribute during this period, will it
incorrectly read 0 (disabled) instead of the logical administrative state
of the event?

[ ... ]
> +static int sysmon_write_event_config(struct iio_dev *indio_dev,
> +				     const struct iio_chan_spec *chan,
> +				     enum iio_event_type type,
> +				     enum iio_event_direction dir,
> +				     bool state)
> +{
> +	u32 offset = SYSMON_ALARM_OFFSET(chan->address);
> +	u32 ier = sysmon_get_event_mask(chan->address);
> +	struct sysmon *sysmon = iio_priv(indio_dev);
> +	unsigned int alarm_config;
> +	int ret;
> +
> +	guard(mutex)(&sysmon->lock);
[ ... ]
> +	if (chan->type == IIO_TEMP) {
> +		if (state) {
> +			ret = regmap_write(sysmon->regmap, SYSMON_IER, ier);
> +			if (ret)
> +				return ret;
> +			sysmon->temp_mask &= ~ier;

[Severity: High]
sysmon_write_event_config() updates sysmon->temp_mask and modifies the
hardware interrupt registers while holding sysmon->lock (a mutex).

However, sysmon_unmask_worker() reads sysmon->temp_mask and modifies the
hardware interrupt registers while holding sysmon->irq_lock (a spinlock).

Since these paths use completely different locks, can they execute
concurrently and create a data race that leaves the hardware interrupt
mask permanently out of sync with the software state?

> +		} else {
> +			ret = regmap_write(sysmon->regmap, SYSMON_IDR, ier);
> +			if (ret)
> +				return ret;
> +			sysmon->temp_mask |= ier;
> +		}
> +	}
> +
> +	return 0;
> +}
[ ... ]
> +static int sysmon_update_temp_lower(struct sysmon *sysmon, int address)
> +{
> +	unsigned int upper_reg;
> +	int upper_mc, lower_mc, hysteresis;
> +	u32 raw_val;
> +	int upper_off, lower_off, ret;
[ ... ]
> +	ret = regmap_read(sysmon->regmap, upper_off, &upper_reg);
> +	if (ret)
> +		return ret;
> +	sysmon_q8p7_to_millicelsius(upper_reg, &upper_mc);
> +
> +	lower_mc = upper_mc - hysteresis;

[Severity: High]
If userspace configures an exceptionally large hysteresis value while upper_mc
is negative, can this subtraction underflow a 32-bit signed integer?

This might produce a large positive value, making the lower threshold higher
than the upper threshold and breaking the alarm logic.

> +	sysmon_millicelsius_to_q8p7(&raw_val, lower_mc);
> +
> +	return regmap_write(sysmon->regmap, lower_off, raw_val);
> +}
[ ... ]
> +static int sysmon_write_event_value(struct iio_dev *indio_dev,
> +				    const struct iio_chan_spec *chan,
> +				    enum iio_event_type type,
> +				    enum iio_event_direction dir,
> +				    enum iio_event_info info,
> +				    int val, int val2)
> +{
[ ... ]
> +	if (chan->type == IIO_VOLTAGE) {
> +		offset = sysmon_supply_thresh_offset(chan->address, dir);
> +		if (offset < 0)
> +			return offset;
> +		ret = regmap_read(sysmon->regmap, offset, &reg_val);
> +		if (ret)
> +			return ret;
> +		sysmon_supply_processedtoraw(val, reg_val, &raw_val);
> +		return regmap_write(sysmon->regmap, offset, raw_val);

[Severity: High]
sysmon_supply_processedtoraw() zero-extends the 16-bit threshold.

Does calling regmap_write() here overwrite the format and mode bits
(bits 16-18) with 0s? Should this be a regmap_update_bits() call instead
to preserve those configuration bits?

> +	}
> +
> +	return -EINVAL;
> +}
[ ... ]
> +static int sysmon_handle_event(struct iio_dev *indio_dev, u32 event)
> +{
[ ... ]
> +	case SYSMON_BIT_ALARM0:
> +	case SYSMON_BIT_ALARM1:
> +	case SYSMON_BIT_ALARM2:
> +	case SYSMON_BIT_ALARM3:
> +	case SYSMON_BIT_ALARM4:
> +		ret = regmap_read(sysmon->regmap, alarm_flag_offset, &reg_val);
> +		if (ret)
> +			return ret;
> +		alarm_flag_reg = reg_val;
> +
> +		for_each_set_bit(bit, &alarm_flag_reg,
> +				 SYSMON_ALARM_BITS_PER_REG) {
> +			address = bit + (SYSMON_ALARM_BITS_PER_REG * event);
> +			sysmon_push_event(indio_dev, address);
> +			ret = regmap_update_bits(sysmon->regmap,
> +						 alarm_reg_offset,
> +						 BIT(bit), 0);

[Severity: High]
When a voltage threshold triggers, the driver clears the corresponding bit
in SYSMON_ALARM_REG to prevent an interrupt storm.

Is there a mechanism to ever re-enable this alarm later? If not, does this
make the voltage threshold alarms act as one-shot events, violating the
continuous nature of IIO events?

> +			if (ret)
> +				return ret;

[Severity: High]
If regmap_update_bits() fails here, the function returns early and skips the
mandatory regmap_write() to alarm_flag_offset that clears the hardware
interrupt status.

Since sysmon_handle_events() ignores the return value and sysmon_iio_irq()
returns IRQ_HANDLED, will this cause the interrupt controller to immediately
re-trigger the handler, resulting in an infinite IRQ storm?

> +		}
> +		ret = regmap_write(sysmon->regmap, alarm_flag_offset,
> +				   alarm_flag_reg);
> +		if (ret)
> +			return ret;
> +		break;
> +
> +	default:
> +		break;
> +	}
> +
> +	return 0;
> +}
[ ... ]
> +static void sysmon_unmask_worker(struct work_struct *work)
> +{
> +	struct sysmon *sysmon = container_of(work, struct sysmon,
> +					     sysmon_unmask_work.work);
> +	unsigned int isr;
> +
> +	spin_lock_irq(&sysmon->irq_lock);
> +	regmap_read(sysmon->regmap, SYSMON_ISR, &isr);
> +	regmap_write(sysmon->regmap, SYSMON_ISR, isr);
> +	sysmon_unmask_temp(sysmon, isr);
> +	spin_unlock_irq(&sysmon->irq_lock);
> +
> +	if (sysmon->masked_temp)
> +		schedule_delayed_work(&sysmon->sysmon_unmask_work,
> +				      msecs_to_jiffies(SYSMON_UNMASK_WORK_DELAY_MS));
> +	else
> +		regmap_write(sysmon->regmap, SYSMON_STATUS_RESET, 1);

[Severity: High]
Is it safe to read sysmon->masked_temp after dropping sysmon->irq_lock?
Could the hardware interrupt handler concurrently update it, leading to a
TOCTOU race?

Additionally, if it evaluates to false, the worker writes to
SYSMON_STATUS_RESET. Doesn't this register clear the hardware Min/Max
status registers (e.g., SYSMON_TEMP_MAX)? Could this permanently destroy
historical telemetry data simply because an alarm cleared?

> +}
[ ... ]
> +static int sysmon_init_interrupt(struct sysmon *sysmon,
> +				 struct device *dev,
> +				 struct iio_dev *indio_dev,
> +				 int irq)
> +{
[ ... ]
> +	return devm_request_irq(dev, irq, sysmon_iio_irq, 0,
> +				"sysmon-irq", indio_dev);

[Severity: High]
The driver unconditionally uses devm_request_irq() to register a hardirq
handler if an interrupt is present in the Device Tree.

If the device is connected via I2C, the hardirq handler will execute
regmap_read() on the I2C regmap, which requires sleeping locks. Will
sleeping inside the hardirq context trigger a scheduling while atomic
kernel panic?

Would it be better to check !regmap_might_sleep() or use
devm_request_threaded_irq() here?

> +}
[ ... ]
>  int sysmon_core_probe(struct device *dev, struct regmap *regmap)
>  {
[ ... ]
> +		ret = sysmon_init_interrupt(sysmon, dev, indio_dev, irq);
> +		if (ret)
> +			return ret;
> +	}
> +
>  	return devm_iio_device_register(dev, indio_dev);

[Severity: Medium]
Does the driver need a teardown hook to physically disable the device's
interrupt generation on module unload?

While devm_request_irq deregisters the kernel handler, the hardware may
continue asserting the physical IRQ line to the interrupt controller,
potentially causing issues if the IRQ is shared.

>  }
>  EXPORT_SYMBOL_GPL(sysmon_core_probe);

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260527114211.174288-1-salih.erim@amd.com?part=4

next prev parent reply	other threads:[~2026-05-27 13:31 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-27 11:42 [PATCH v3 0/5] iio: adc: add AMD/Xilinx Versal SysMon driver Salih Erim
2026-05-27 11:42 ` [PATCH v3 1/5] dt-bindings: iio: adc: add xlnx,versal-sysmon binding Salih Erim
2026-05-28  8:38   ` Krzysztof Kozlowski
2026-05-27 11:42 ` [PATCH v3 2/5] iio: adc: add Versal SysMon driver Salih Erim
2026-05-27 12:29   ` sashiko-bot
2026-05-28 12:24   ` Jonathan Cameron
2026-05-28 22:07     ` Erim, Salih
2026-05-27 11:42 ` [PATCH v3 3/5] iio: adc: versal-sysmon: add I2C driver Salih Erim
2026-05-27 13:01   ` sashiko-bot
2026-05-28 12:42   ` Jonathan Cameron
2026-05-28 22:12     ` Erim, Salih
2026-05-27 11:42 ` [PATCH v3 4/5] iio: adc: versal-sysmon: add threshold event support Salih Erim
2026-05-27 13:31   ` sashiko-bot [this message]
2026-05-28 13:01   ` Jonathan Cameron
2026-05-28 22:18     ` Erim, Salih
2026-05-27 11:42 ` [PATCH v3 5/5] iio: adc: versal-sysmon: add oversampling support Salih Erim
2026-05-27 13:52   ` sashiko-bot
2026-05-28 13:05   ` Jonathan Cameron
2026-05-28 22:27     ` Erim, Salih
2026-05-28 12:06 ` [PATCH v3 0/5] iio: adc: add AMD/Xilinx Versal SysMon driver Jonathan Cameron
2026-05-28 21:46   ` Erim, Salih
2026-05-29  9:03     ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260527133135.C1ED11F000E9@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=robh@kernel.org \
    --cc=salih.erim@amd.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox