linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4] staging: iio: adc: ad7816: fix race condition in SPI operations
@ 2025-09-01 16:03 Mohammad Amin Hosseini
  2025-09-01 17:00 ` David Lechner
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Mohammad Amin Hosseini @ 2025-09-01 16:03 UTC (permalink / raw)
  To: linux-iio, linux-staging
  Cc: linux-kernel, gregkh, jic23, lars, Michael.Hennerich, dlechner,
	nuno.sa, andy, sonic.zhang, vapier, dan.carpenter,
	Mohammad Amin Hosseini

The ad7816 driver lacks proper synchronization around SPI operations
and device state access. Concurrent access from multiple threads can
lead to data corruption and inconsistent device state.

The driver performs sequences of GPIO pin manipulations followed by
SPI transactions without any locking. Device state variables (mode,
channel_id, oti_data) are also accessed without synchronization.

This bug was found through manual code review using static analysis
techniques. The review focused on identifying unsynchronized access
patterns to shared resources. Key indicators were:
- GPIO pin state changes followed by SPI operations without atomicity
- Shared state variables accessed from multiple sysfs entry points
- No mutex or spinlock protection around sections
- Potential for interleaved execution in multi-threaded environments

The review methodology involved tracing data flow paths and identifying
points where concurrent access could corrupt device state or SPI
communication sequences.

Add io_lock mutex to protect:
- SPI transactions and GPIO sequences in read/write functions
- Device state variables in sysfs show/store functions
- Concurrent access to chip configuration

This prevents race conditions when multiple processes access the device
simultaneously through sysfs attributes or device file operations.

Fixes: 7924425db04a ("staging: iio: adc: new driver for AD7816 devices")

Signed-off-by: Mohammad Amin Hosseini <moahmmad.hosseinii@gmail.com>

---
Changes in v4:
- Added locking to reader functions (show_mode, show_channel, show_oti)
- Fixed incomplete reader/writer synchronization that could still race
- Ensured all device state access is properly synchronized
- Replace sprintf() with sysfs_emit() in all sysfs show functions
- Use sysfs_streq() instead of strcmp() for proper input parsing
- Implement locked/unlocked SPI function variants to prevent deadlock
- Use channel snapshot to ensure atomic read operations
- Fix sizeof() usage in spi_read to be more explicit (sizeof(buf))
- Make oti write operations atomic (SPI write + shadow update under lock)
- Fix race condition in ad7816_set_oti() by taking channel_id snapshot under lock
- Fix return type consistency (ssize_t vs int) in show functions
- Use chip->id instead of string comparison for channel validation
- Add explicit cast for narrowing assignment
- Add default case for unknown chip ID validation
- Use cansleep GPIO variants in sleepable context
- Improve lock documentation for protected resources
---
 drivers/staging/iio/adc/ad7816.c | 210 ++++++++++++++++++++-----------
 1 file changed, 138 insertions(+), 72 deletions(-)

diff --git a/drivers/staging/iio/adc/ad7816.c b/drivers/staging/iio/adc/ad7816.c
index 4774df778de9..49a67e1b76f6 100644
--- a/drivers/staging/iio/adc/ad7816.c
+++ b/drivers/staging/iio/adc/ad7816.c
@@ -50,6 +50,15 @@ struct ad7816_chip_info {
 	u8  oti_data[AD7816_CS_MAX + 1];
 	u8  channel_id;	/* 0 always be temperature */
 	u8  mode;
+  /*
+   * Protects:
+   *  - SPI transactions
+   *  - GPIO toggling
+   *  - channel_id
+   *  - mode
+   *  - oti_data
+   */
+	struct mutex io_lock;
 };
 
 enum ad7816_type {
@@ -59,60 +68,71 @@ enum ad7816_type {
 };
 
 /*
- * ad7816 data access by SPI
+ * ad7816 data access by SPI - locked versions assume io_lock is held
  */
-static int ad7816_spi_read(struct ad7816_chip_info *chip, u16 *data)
+static int ad7816_spi_read_locked(struct ad7816_chip_info *chip, u8 channel, u16 *data)
 {
 	struct spi_device *spi_dev = chip->spi_dev;
 	int ret;
 	__be16 buf;
 
-	gpiod_set_value(chip->rdwr_pin, 1);
-	gpiod_set_value(chip->rdwr_pin, 0);
-	ret = spi_write(spi_dev, &chip->channel_id, sizeof(chip->channel_id));
-	if (ret < 0) {
-		dev_err(&spi_dev->dev, "SPI channel setting error\n");
+	gpiod_set_value_cansleep(chip->rdwr_pin, 1);
+	gpiod_set_value_cansleep(chip->rdwr_pin, 0);
+	/* AD7816_CS_MASK: broadcast/all-channels per hw programming model */
+	ret = spi_write(spi_dev, &channel, sizeof(channel));
+	if (ret < 0)
 		return ret;
-	}
-	gpiod_set_value(chip->rdwr_pin, 1);
+	gpiod_set_value_cansleep(chip->rdwr_pin, 1);
 
 	if (chip->mode == AD7816_PD) { /* operating mode 2 */
-		gpiod_set_value(chip->convert_pin, 1);
-		gpiod_set_value(chip->convert_pin, 0);
+		gpiod_set_value_cansleep(chip->convert_pin, 1);
+		gpiod_set_value_cansleep(chip->convert_pin, 0);
 	} else { /* operating mode 1 */
-		gpiod_set_value(chip->convert_pin, 0);
-		gpiod_set_value(chip->convert_pin, 1);
+		gpiod_set_value_cansleep(chip->convert_pin, 0);
+		gpiod_set_value_cansleep(chip->convert_pin, 1);
 	}
 
 	if (chip->id == ID_AD7816 || chip->id == ID_AD7817) {
-		while (gpiod_get_value(chip->busy_pin))
+		while (gpiod_get_value_cansleep(chip->busy_pin))
 			cpu_relax();
 	}
 
-	gpiod_set_value(chip->rdwr_pin, 0);
-	gpiod_set_value(chip->rdwr_pin, 1);
-	ret = spi_read(spi_dev, &buf, sizeof(*data));
-	if (ret < 0) {
-		dev_err(&spi_dev->dev, "SPI data read error\n");
+	gpiod_set_value_cansleep(chip->rdwr_pin, 0);
+	gpiod_set_value_cansleep(chip->rdwr_pin, 1);
+	ret = spi_read(spi_dev, &buf, sizeof(buf));
+	if (ret < 0)
 		return ret;
-	}
 
 	*data = be16_to_cpu(buf);
+	return 0;
+}
 
+static int __maybe_unused ad7816_spi_read(struct ad7816_chip_info *chip, u8 channel, u16 *data)
+{
+	int ret;
+
+	mutex_lock(&chip->io_lock);
+	ret = ad7816_spi_read_locked(chip, channel, data);
+	mutex_unlock(&chip->io_lock);
 	return ret;
 }
 
-static int ad7816_spi_write(struct ad7816_chip_info *chip, u8 data)
+static int ad7816_spi_write_locked(struct ad7816_chip_info *chip, u8 data)
 {
 	struct spi_device *spi_dev = chip->spi_dev;
-	int ret;
 
-	gpiod_set_value(chip->rdwr_pin, 1);
-	gpiod_set_value(chip->rdwr_pin, 0);
-	ret = spi_write(spi_dev, &data, sizeof(data));
-	if (ret < 0)
-		dev_err(&spi_dev->dev, "SPI oti data write error\n");
+	gpiod_set_value_cansleep(chip->rdwr_pin, 1);
+	gpiod_set_value_cansleep(chip->rdwr_pin, 0);
+	return spi_write(spi_dev, &data, sizeof(data));
+}
+
+static int __maybe_unused ad7816_spi_write(struct ad7816_chip_info *chip, u8 data)
+{
+	int ret;
 
+	mutex_lock(&chip->io_lock);
+	ret = ad7816_spi_write_locked(chip, data);
+	mutex_unlock(&chip->io_lock);
 	return ret;
 }
 
@@ -122,10 +142,13 @@ static ssize_t ad7816_show_mode(struct device *dev,
 {
 	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
 	struct ad7816_chip_info *chip = iio_priv(indio_dev);
+	ssize_t ret;
+
+	mutex_lock(&chip->io_lock);
+	ret = sysfs_emit(buf, "%s\n", chip->mode ? "power-save" : "full");
+	mutex_unlock(&chip->io_lock);
 
-	if (chip->mode)
-		return sprintf(buf, "power-save\n");
-	return sprintf(buf, "full\n");
+	return ret;
 }
 
 static ssize_t ad7816_store_mode(struct device *dev,
@@ -136,13 +159,18 @@ static ssize_t ad7816_store_mode(struct device *dev,
 	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
 	struct ad7816_chip_info *chip = iio_priv(indio_dev);
 
-	if (strcmp(buf, "full") == 0) {
-		gpiod_set_value(chip->rdwr_pin, 1);
+	mutex_lock(&chip->io_lock);
+	if (sysfs_streq(buf, "full")) {
+		gpiod_set_value_cansleep(chip->rdwr_pin, 1);
 		chip->mode = AD7816_FULL;
-	} else {
-		gpiod_set_value(chip->rdwr_pin, 0);
+	} else if (sysfs_streq(buf, "power-save")) {
+		gpiod_set_value_cansleep(chip->rdwr_pin, 0);
 		chip->mode = AD7816_PD;
+	} else {
+		mutex_unlock(&chip->io_lock);
+		return -EINVAL;
 	}
+	mutex_unlock(&chip->io_lock);
 
 	return len;
 }
@@ -156,7 +184,7 @@ static ssize_t ad7816_show_available_modes(struct device *dev,
 					   struct device_attribute *attr,
 					   char *buf)
 {
-	return sprintf(buf, "full\npower-save\n");
+	return sysfs_emit(buf, "full\npower-save\n");
 }
 
 static IIO_DEVICE_ATTR(available_modes, 0444, ad7816_show_available_modes,
@@ -168,8 +196,13 @@ static ssize_t ad7816_show_channel(struct device *dev,
 {
 	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
 	struct ad7816_chip_info *chip = iio_priv(indio_dev);
+	ssize_t ret;
 
-	return sprintf(buf, "%d\n", chip->channel_id);
+	mutex_lock(&chip->io_lock);
+	ret = sysfs_emit(buf, "%u\n", chip->channel_id);
+	mutex_unlock(&chip->io_lock);
+
+	return ret;
 }
 
 static ssize_t ad7816_store_channel(struct device *dev,
@@ -190,17 +223,34 @@ static ssize_t ad7816_store_channel(struct device *dev,
 		dev_err(&chip->spi_dev->dev, "Invalid channel id %lu for %s.\n",
 			data, indio_dev->name);
 		return -EINVAL;
-	} else if (strcmp(indio_dev->name, "ad7818") == 0 && data > 1) {
-		dev_err(&chip->spi_dev->dev,
-			"Invalid channel id %lu for ad7818.\n", data);
-		return -EINVAL;
-	} else if (strcmp(indio_dev->name, "ad7816") == 0 && data > 0) {
-		dev_err(&chip->spi_dev->dev,
-			"Invalid channel id %lu for ad7816.\n", data);
+	}
+
+	switch (chip->id) {
+	case ID_AD7816:
+		if (data > 0) {
+			dev_err(&chip->spi_dev->dev,
+				"Invalid channel id %lu for ad7816.\n", data);
+			return -EINVAL;
+		}
+		break;
+	case ID_AD7818:
+		if (data > 1) {
+			dev_err(&chip->spi_dev->dev,
+				"Invalid channel id %lu for ad7818.\n", data);
+			return -EINVAL;
+		}
+		break;
+	case ID_AD7817:
+		/* AD7817 allows all channels up to AD7816_CS_MAX */
+		break;
+	default:
+		dev_err(&chip->spi_dev->dev, "Unknown chip id %lu\n", chip->id);
 		return -EINVAL;
 	}
 
-	chip->channel_id = data;
+	mutex_lock(&chip->io_lock);
+	chip->channel_id = (u8)data;
+	mutex_unlock(&chip->io_lock);
 
 	return len;
 }
@@ -219,21 +269,25 @@ static ssize_t ad7816_show_value(struct device *dev,
 	u16 data;
 	s8 value;
 	int ret;
+	u8 ch;
 
-	ret = ad7816_spi_read(chip, &data);
+	mutex_lock(&chip->io_lock);
+	ch = chip->channel_id;                                   /* snapshot */
+	ret = ad7816_spi_read_locked(chip, ch, &data);           /* same lock */
+	mutex_unlock(&chip->io_lock);
 	if (ret)
-		return -EIO;
+		return ret;
 
 	data >>= AD7816_VALUE_OFFSET;
 
-	if (chip->channel_id == 0) {
+	if (ch == 0) {
 		value = (s8)((data >> AD7816_TEMP_FLOAT_OFFSET) - 103);
 		data &= AD7816_TEMP_FLOAT_MASK;
 		if (value < 0)
 			data = BIT(AD7816_TEMP_FLOAT_OFFSET) - data;
-		return sprintf(buf, "%d.%.2d\n", value, data * 25);
+		return sysfs_emit(buf, "%d.%.2d\n", value, data * 25);
 	}
-	return sprintf(buf, "%u\n", data);
+	return sysfs_emit(buf, "%u\n", data);
 }
 
 static IIO_DEVICE_ATTR(value, 0444, ad7816_show_value, NULL, 0);
@@ -273,58 +327,69 @@ static ssize_t ad7816_show_oti(struct device *dev,
 	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
 	struct ad7816_chip_info *chip = iio_priv(indio_dev);
 	int value;
+	ssize_t ret;
 
+	mutex_lock(&chip->io_lock);
 	if (chip->channel_id > AD7816_CS_MAX) {
-		dev_err(dev, "Invalid oti channel id %d.\n", chip->channel_id);
+		dev_err(dev, "Invalid oti channel id %u.\n", chip->channel_id);
+		mutex_unlock(&chip->io_lock);
 		return -EINVAL;
 	} else if (chip->channel_id == 0) {
 		value = AD7816_BOUND_VALUE_MIN +
-			(chip->oti_data[chip->channel_id] -
-			AD7816_BOUND_VALUE_BASE);
-		return sprintf(buf, "%d\n", value);
+			(chip->oti_data[chip->channel_id] - AD7816_BOUND_VALUE_BASE);
+		ret = sysfs_emit(buf, "%d\n", value);
+	} else {
+		ret = sysfs_emit(buf, "%u\n", chip->oti_data[chip->channel_id]);
 	}
-	return sprintf(buf, "%u\n", chip->oti_data[chip->channel_id]);
+	mutex_unlock(&chip->io_lock);
+
+	return ret;
 }
 
-static inline ssize_t ad7816_set_oti(struct device *dev,
-				     struct device_attribute *attr,
-				     const char *buf,
-				     size_t len)
+static ssize_t ad7816_set_oti(struct device *dev,
+			      struct device_attribute *attr,
+			      const char *buf,
+			      size_t len)
 {
 	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
 	struct ad7816_chip_info *chip = iio_priv(indio_dev);
 	long value;
-	u8 data;
+	u8 data, ch;
 	int ret;
 
 	ret = kstrtol(buf, 10, &value);
 	if (ret)
 		return ret;
 
-	if (chip->channel_id > AD7816_CS_MAX) {
-		dev_err(dev, "Invalid oti channel id %d.\n", chip->channel_id);
+	mutex_lock(&chip->io_lock);
+	ch = chip->channel_id;
+
+	if (ch > AD7816_CS_MAX) {
+		dev_err(dev, "Invalid oti channel id %u.\n", ch);
+		mutex_unlock(&chip->io_lock);
 		return -EINVAL;
-	} else if (chip->channel_id == 0) {
+	} else if (ch == 0) {
 		if (value < AD7816_BOUND_VALUE_MIN ||
-		    value > AD7816_BOUND_VALUE_MAX)
+		    value > AD7816_BOUND_VALUE_MAX) {
+			mutex_unlock(&chip->io_lock);
 			return -EINVAL;
-
+		}
 		data = (u8)(value - AD7816_BOUND_VALUE_MIN +
 			AD7816_BOUND_VALUE_BASE);
 	} else {
-		if (value < AD7816_BOUND_VALUE_BASE || value > 255)
+		if (value < AD7816_BOUND_VALUE_BASE || value > 255) {
+			mutex_unlock(&chip->io_lock);
 			return -EINVAL;
-
+		}
 		data = (u8)value;
 	}
 
-	ret = ad7816_spi_write(chip, data);
-	if (ret)
-		return -EIO;
+	ret = ad7816_spi_write_locked(chip, data);
+	if (!ret)
+		chip->oti_data[ch] = data;
 
-	chip->oti_data[chip->channel_id] = data;
-
-	return len;
+	mutex_unlock(&chip->io_lock);
+	return ret ? ret : len;
 }
 
 static IIO_DEVICE_ATTR(oti, 0644,
@@ -363,6 +428,7 @@ static int ad7816_probe(struct spi_device *spi_dev)
 	dev_set_drvdata(&spi_dev->dev, indio_dev);
 
 	chip->spi_dev = spi_dev;
+	mutex_init(&chip->io_lock);
 	for (i = 0; i <= AD7816_CS_MAX; i++)
 		chip->oti_data[i] = 203;
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v4] staging: iio: adc: ad7816: fix race condition in SPI operations
  2025-09-01 16:03 [PATCH v4] staging: iio: adc: ad7816: fix race condition in SPI operations Mohammad Amin Hosseini
@ 2025-09-01 17:00 ` David Lechner
  2025-09-01 19:46   ` Jonathan Cameron
  2025-09-01 19:00 ` Dan Carpenter
  2025-09-01 20:01 ` Greg KH
  2 siblings, 1 reply; 5+ messages in thread
From: David Lechner @ 2025-09-01 17:00 UTC (permalink / raw)
  To: Mohammad Amin Hosseini, linux-iio, linux-staging
  Cc: linux-kernel, gregkh, jic23, lars, Michael.Hennerich, nuno.sa,
	andy, sonic.zhang, vapier, dan.carpenter

On 9/1/25 11:03 AM, Mohammad Amin Hosseini wrote:
> The ad7816 driver lacks proper synchronization around SPI operations
> and device state access. Concurrent access from multiple threads can
> lead to data corruption and inconsistent device state.
> 
> The driver performs sequences of GPIO pin manipulations followed by
> SPI transactions without any locking. Device state variables (mode,
> channel_id, oti_data) are also accessed without synchronization.
> 
> This bug was found through manual code review using static analysis
> techniques. The review focused on identifying unsynchronized access
> patterns to shared resources. Key indicators were:
> - GPIO pin state changes followed by SPI operations without atomicity
> - Shared state variables accessed from multiple sysfs entry points
> - No mutex or spinlock protection around sections
> - Potential for interleaved execution in multi-threaded environments
> 
> The review methodology involved tracing data flow paths and identifying
> points where concurrent access could corrupt device state or SPI
> communication sequences.
> 
> Add io_lock mutex to protect:
> - SPI transactions and GPIO sequences in read/write functions
> - Device state variables in sysfs show/store functions
> - Concurrent access to chip configuration
> 
> This prevents race conditions when multiple processes access the device
> simultaneously through sysfs attributes or device file operations.
> 
> Fixes: 7924425db04a ("staging: iio: adc: new driver for AD7816 devices")
> 
> Signed-off-by: Mohammad Amin Hosseini <moahmmad.hosseinii@gmail.com>
> 
> ---
> Changes in v4:
> - Added locking to reader functions (show_mode, show_channel, show_oti)
> - Fixed incomplete reader/writer synchronization that could still race
> - Ensured all device state access is properly synchronized
> - Replace sprintf() with sysfs_emit() in all sysfs show functions
> - Use sysfs_streq() instead of strcmp() for proper input parsing
> - Implement locked/unlocked SPI function variants to prevent deadlock
> - Use channel snapshot to ensure atomic read operations
> - Fix sizeof() usage in spi_read to be more explicit (sizeof(buf))
> - Make oti write operations atomic (SPI write + shadow update under lock)
> - Fix race condition in ad7816_set_oti() by taking channel_id snapshot under lock
> - Fix return type consistency (ssize_t vs int) in show functions
> - Use chip->id instead of string comparison for channel validation
> - Add explicit cast for narrowing assignment
> - Add default case for unknown chip ID validation
> - Use cansleep GPIO variants in sleepable context
> - Improve lock documentation for protected resources
> ---

This is way to much to do in a single patch. Also, given that this
part is obsolete [1] and this driver is in staging, is it really
worth all of this effort to fix it up?

[1]: https://www.analog.com/en/products/ad7816.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4] staging: iio: adc: ad7816: fix race condition in SPI operations
  2025-09-01 16:03 [PATCH v4] staging: iio: adc: ad7816: fix race condition in SPI operations Mohammad Amin Hosseini
  2025-09-01 17:00 ` David Lechner
@ 2025-09-01 19:00 ` Dan Carpenter
  2025-09-01 20:01 ` Greg KH
  2 siblings, 0 replies; 5+ messages in thread
From: Dan Carpenter @ 2025-09-01 19:00 UTC (permalink / raw)
  To: Mohammad Amin Hosseini
  Cc: linux-iio, linux-staging, linux-kernel, gregkh, jic23, lars,
	Michael.Hennerich, dlechner, nuno.sa, andy, sonic.zhang, vapier

On Mon, Sep 01, 2025 at 07:33:10PM +0330, Mohammad Amin Hosseini wrote:
> The ad7816 driver lacks proper synchronization around SPI operations
> and device state access. Concurrent access from multiple threads can
> lead to data corruption and inconsistent device state.
> 
> The driver performs sequences of GPIO pin manipulations followed by
> SPI transactions without any locking. Device state variables (mode,
> channel_id, oti_data) are also accessed without synchronization.
> 
> This bug was found through manual code review using static analysis
> techniques. The review focused on identifying unsynchronized access
> patterns to shared resources. Key indicators were:
> - GPIO pin state changes followed by SPI operations without atomicity
> - Shared state variables accessed from multiple sysfs entry points
> - No mutex or spinlock protection around sections
> - Potential for interleaved execution in multi-threaded environments
> 
> The review methodology involved tracing data flow paths and identifying
> points where concurrent access could corrupt device state or SPI
> communication sequences.
> 
> Add io_lock mutex to protect:
> - SPI transactions and GPIO sequences in read/write functions
> - Device state variables in sysfs show/store functions
> - Concurrent access to chip configuration
> 
> This prevents race conditions when multiple processes access the device
> simultaneously through sysfs attributes or device file operations.
> 
> Fixes: 7924425db04a ("staging: iio: adc: new driver for AD7816 devices")
> 
> Signed-off-by: Mohammad Amin Hosseini <moahmmad.hosseinii@gmail.com>
> 
> ---
> Changes in v4:
> - Added locking to reader functions (show_mode, show_channel, show_oti)
> - Fixed incomplete reader/writer synchronization that could still race
> - Ensured all device state access is properly synchronized
> - Replace sprintf() with sysfs_emit() in all sysfs show functions
> - Use sysfs_streq() instead of strcmp() for proper input parsing
> - Implement locked/unlocked SPI function variants to prevent deadlock
> - Use channel snapshot to ensure atomic read operations
> - Fix sizeof() usage in spi_read to be more explicit (sizeof(buf))
> - Make oti write operations atomic (SPI write + shadow update under lock)
> - Fix race condition in ad7816_set_oti() by taking channel_id snapshot under lock
> - Fix return type consistency (ssize_t vs int) in show functions
> - Use chip->id instead of string comparison for channel validation
> - Add explicit cast for narrowing assignment
> - Add default case for unknown chip ID validation
> - Use cansleep GPIO variants in sleepable context

Why?  This isn't described in the commit message.

(This is not a rhetorical question and the answer will change how we
approach this).

> - Improve lock documentation for protected resources
> ---
>  drivers/staging/iio/adc/ad7816.c | 210 ++++++++++++++++++++-----------
>  1 file changed, 138 insertions(+), 72 deletions(-)
> 
> diff --git a/drivers/staging/iio/adc/ad7816.c b/drivers/staging/iio/adc/ad7816.c
> index 4774df778de9..49a67e1b76f6 100644
> --- a/drivers/staging/iio/adc/ad7816.c
> +++ b/drivers/staging/iio/adc/ad7816.c
> @@ -50,6 +50,15 @@ struct ad7816_chip_info {
>  	u8  oti_data[AD7816_CS_MAX + 1];
>  	u8  channel_id;	/* 0 always be temperature */
>  	u8  mode;
> +  /*
> +   * Protects:
> +   *  - SPI transactions
> +   *  - GPIO toggling
> +   *  - channel_id
> +   *  - mode
> +   *  - oti_data
> +   */

Indenting is wrong.

> +	struct mutex io_lock;
>  };
>  
>  enum ad7816_type {
> @@ -59,60 +68,71 @@ enum ad7816_type {
>  };
>  
>  /*
> - * ad7816 data access by SPI
> + * ad7816 data access by SPI - locked versions assume io_lock is held
>   */
> -static int ad7816_spi_read(struct ad7816_chip_info *chip, u16 *data)
> +static int ad7816_spi_read_locked(struct ad7816_chip_info *chip, u8 channel, u16 *data)
>  {
>  	struct spi_device *spi_dev = chip->spi_dev;
>  	int ret;
>  	__be16 buf;
>  
> -	gpiod_set_value(chip->rdwr_pin, 1);
> -	gpiod_set_value(chip->rdwr_pin, 0);
> -	ret = spi_write(spi_dev, &chip->channel_id, sizeof(chip->channel_id));
> -	if (ret < 0) {
> -		dev_err(&spi_dev->dev, "SPI channel setting error\n");
> +	gpiod_set_value_cansleep(chip->rdwr_pin, 1);
> +	gpiod_set_value_cansleep(chip->rdwr_pin, 0);
> +	/* AD7816_CS_MASK: broadcast/all-channels per hw programming model */
> +	ret = spi_write(spi_dev, &channel, sizeof(channel));
> +	if (ret < 0)
>  		return ret;
> -	}
> -	gpiod_set_value(chip->rdwr_pin, 1);
> +	gpiod_set_value_cansleep(chip->rdwr_pin, 1);
>  
>  	if (chip->mode == AD7816_PD) { /* operating mode 2 */
> -		gpiod_set_value(chip->convert_pin, 1);
> -		gpiod_set_value(chip->convert_pin, 0);
> +		gpiod_set_value_cansleep(chip->convert_pin, 1);
> +		gpiod_set_value_cansleep(chip->convert_pin, 0);
>  	} else { /* operating mode 1 */
> -		gpiod_set_value(chip->convert_pin, 0);
> -		gpiod_set_value(chip->convert_pin, 1);
> +		gpiod_set_value_cansleep(chip->convert_pin, 0);
> +		gpiod_set_value_cansleep(chip->convert_pin, 1);
>  	}
>  
>  	if (chip->id == ID_AD7816 || chip->id == ID_AD7817) {
> -		while (gpiod_get_value(chip->busy_pin))
> +		while (gpiod_get_value_cansleep(chip->busy_pin))
>  			cpu_relax();
>  	}
>  
> -	gpiod_set_value(chip->rdwr_pin, 0);
> -	gpiod_set_value(chip->rdwr_pin, 1);
> -	ret = spi_read(spi_dev, &buf, sizeof(*data));
> -	if (ret < 0) {
> -		dev_err(&spi_dev->dev, "SPI data read error\n");
> +	gpiod_set_value_cansleep(chip->rdwr_pin, 0);
> +	gpiod_set_value_cansleep(chip->rdwr_pin, 1);
> +	ret = spi_read(spi_dev, &buf, sizeof(buf));
> +	if (ret < 0)
>  		return ret;
> -	}
>  
>  	*data = be16_to_cpu(buf);
> +	return 0;
> +}
>  
> +static int __maybe_unused ad7816_spi_read(struct ad7816_chip_info *chip, u8 channel, u16 *data)
> +{
> +	int ret;
> +
> +	mutex_lock(&chip->io_lock);
> +	ret = ad7816_spi_read_locked(chip, channel, data);
> +	mutex_unlock(&chip->io_lock);
>  	return ret;
>  }
>  
> -static int ad7816_spi_write(struct ad7816_chip_info *chip, u8 data)
> +static int ad7816_spi_write_locked(struct ad7816_chip_info *chip, u8 data)
>  {
>  	struct spi_device *spi_dev = chip->spi_dev;
> -	int ret;
>  
> -	gpiod_set_value(chip->rdwr_pin, 1);
> -	gpiod_set_value(chip->rdwr_pin, 0);
> -	ret = spi_write(spi_dev, &data, sizeof(data));
> -	if (ret < 0)
> -		dev_err(&spi_dev->dev, "SPI oti data write error\n");
> +	gpiod_set_value_cansleep(chip->rdwr_pin, 1);
> +	gpiod_set_value_cansleep(chip->rdwr_pin, 0);
> +	return spi_write(spi_dev, &data, sizeof(data));
> +}
> +
> +static int __maybe_unused ad7816_spi_write(struct ad7816_chip_info *chip, u8 data)
> +{
> +	int ret;
>  
> +	mutex_lock(&chip->io_lock);
> +	ret = ad7816_spi_write_locked(chip, data);
> +	mutex_unlock(&chip->io_lock);
>  	return ret;
>  }
>  
> @@ -122,10 +142,13 @@ static ssize_t ad7816_show_mode(struct device *dev,
>  {
>  	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
>  	struct ad7816_chip_info *chip = iio_priv(indio_dev);
> +	ssize_t ret;
> +
> +	mutex_lock(&chip->io_lock);
> +	ret = sysfs_emit(buf, "%s\n", chip->mode ? "power-save" : "full");
> +	mutex_unlock(&chip->io_lock);

This locking isn't required.  If you're deliberately racing against
yourself and you trick the driver into briefly printing that it's
in power-save mode instead of full then, you know, who cares?  You're
allowed to change the mode so you just pranked your ownself for no
reason.

>  
> -	if (chip->mode)
> -		return sprintf(buf, "power-save\n");
> -	return sprintf(buf, "full\n");
> +	return ret;
>  }
>  
>  static ssize_t ad7816_store_mode(struct device *dev,
> @@ -136,13 +159,18 @@ static ssize_t ad7816_store_mode(struct device *dev,
>  	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
>  	struct ad7816_chip_info *chip = iio_priv(indio_dev);
>  
> -	if (strcmp(buf, "full") == 0) {
> -		gpiod_set_value(chip->rdwr_pin, 1);
> +	mutex_lock(&chip->io_lock);
> +	if (sysfs_streq(buf, "full")) {

Using sysfs_streq() is unrelated.  Don't mix unrelated stuff into the
same patch.

> +		gpiod_set_value_cansleep(chip->rdwr_pin, 1);
>  		chip->mode = AD7816_FULL;
> -	} else {
> -		gpiod_set_value(chip->rdwr_pin, 0);
> +	} else if (sysfs_streq(buf, "power-save")) {
> +		gpiod_set_value_cansleep(chip->rdwr_pin, 0);
>  		chip->mode = AD7816_PD;
> +	} else {
> +		mutex_unlock(&chip->io_lock);
> +		return -EINVAL;
>  	}
> +	mutex_unlock(&chip->io_lock);
>  
>  	return len;
>  }
> @@ -156,7 +184,7 @@ static ssize_t ad7816_show_available_modes(struct device *dev,
>  					   struct device_attribute *attr,
>  					   char *buf)
>  {
> -	return sprintf(buf, "full\npower-save\n");
> +	return sysfs_emit(buf, "full\npower-save\n");

Unrelated.

>  }
>  
>  static IIO_DEVICE_ATTR(available_modes, 0444, ad7816_show_available_modes,
> @@ -168,8 +196,13 @@ static ssize_t ad7816_show_channel(struct device *dev,
>  {
>  	struct iio_dev *indio_dev = dev_to_iio_dev(dev);
>  	struct ad7816_chip_info *chip = iio_priv(indio_dev);
> +	ssize_t ret;
>  
> -	return sprintf(buf, "%d\n", chip->channel_id);
> +	mutex_lock(&chip->io_lock);
> +	ret = sysfs_emit(buf, "%u\n", chip->channel_id);
> +	mutex_unlock(&chip->io_lock);
> +
> +	return ret;
>  }
>  
>  static ssize_t ad7816_store_channel(struct device *dev,
> @@ -190,17 +223,34 @@ static ssize_t ad7816_store_channel(struct device *dev,
>  		dev_err(&chip->spi_dev->dev, "Invalid channel id %lu for %s.\n",
>  			data, indio_dev->name);
>  		return -EINVAL;
> -	} else if (strcmp(indio_dev->name, "ad7818") == 0 && data > 1) {
> -		dev_err(&chip->spi_dev->dev,
> -			"Invalid channel id %lu for ad7818.\n", data);
> -		return -EINVAL;
> -	} else if (strcmp(indio_dev->name, "ad7816") == 0 && data > 0) {
> -		dev_err(&chip->spi_dev->dev,
> -			"Invalid channel id %lu for ad7816.\n", data);
> +	}
> +
> +	switch (chip->id) {
> +	case ID_AD7816:
> +		if (data > 0) {
> +			dev_err(&chip->spi_dev->dev,
> +				"Invalid channel id %lu for ad7816.\n", data);
> +			return -EINVAL;
> +		}
> +		break;
> +	case ID_AD7818:
> +		if (data > 1) {
> +			dev_err(&chip->spi_dev->dev,
> +				"Invalid channel id %lu for ad7818.\n", data);
> +			return -EINVAL;
> +		}
> +		break;
> +	case ID_AD7817:
> +		/* AD7817 allows all channels up to AD7816_CS_MAX */
> +		break;
> +	default:
> +		dev_err(&chip->spi_dev->dev, "Unknown chip id %lu\n", chip->id);
>  		return -EINVAL;
>  	}
>  
> -	chip->channel_id = data;
> +	mutex_lock(&chip->io_lock);
> +	chip->channel_id = (u8)data;

There are way too many unrelated things happening and a bunch of them are
controversial in ways that you aren't expecting.  For example, a lot of
people (me) hate pointless casts...  We already have bounds checked data
so there is no need to cast.  If we wanted to we could declare data as a
u8, and that would also be pointless since we have to bounds check it
anyway but at least it's better than casting.

Version 3 wasn't very far off from being correct and in version 4 you
fixed the remaining locking issues I saw.  But the patch does way too
many other things too.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4] staging: iio: adc: ad7816: fix race condition in SPI operations
  2025-09-01 17:00 ` David Lechner
@ 2025-09-01 19:46   ` Jonathan Cameron
  0 siblings, 0 replies; 5+ messages in thread
From: Jonathan Cameron @ 2025-09-01 19:46 UTC (permalink / raw)
  To: David Lechner
  Cc: Mohammad Amin Hosseini, linux-iio, linux-staging, linux-kernel,
	gregkh, lars, Michael.Hennerich, nuno.sa, andy, sonic.zhang,
	vapier, dan.carpenter

On Mon, 1 Sep 2025 12:00:44 -0500
David Lechner <dlechner@baylibre.com> wrote:

> On 9/1/25 11:03 AM, Mohammad Amin Hosseini wrote:
> > The ad7816 driver lacks proper synchronization around SPI operations
> > and device state access. Concurrent access from multiple threads can
> > lead to data corruption and inconsistent device state.
> > 
> > The driver performs sequences of GPIO pin manipulations followed by
> > SPI transactions without any locking. Device state variables (mode,
> > channel_id, oti_data) are also accessed without synchronization.
> > 
> > This bug was found through manual code review using static analysis
> > techniques. The review focused on identifying unsynchronized access
> > patterns to shared resources. Key indicators were:
> > - GPIO pin state changes followed by SPI operations without atomicity
> > - Shared state variables accessed from multiple sysfs entry points
> > - No mutex or spinlock protection around sections
> > - Potential for interleaved execution in multi-threaded environments
> > 
> > The review methodology involved tracing data flow paths and identifying
> > points where concurrent access could corrupt device state or SPI
> > communication sequences.
> > 
> > Add io_lock mutex to protect:
> > - SPI transactions and GPIO sequences in read/write functions
> > - Device state variables in sysfs show/store functions
> > - Concurrent access to chip configuration
> > 
> > This prevents race conditions when multiple processes access the device
> > simultaneously through sysfs attributes or device file operations.
> > 
> > Fixes: 7924425db04a ("staging: iio: adc: new driver for AD7816 devices")
> > 
> > Signed-off-by: Mohammad Amin Hosseini <moahmmad.hosseinii@gmail.com>
> > 
> > ---
> > Changes in v4:
> > - Added locking to reader functions (show_mode, show_channel, show_oti)
> > - Fixed incomplete reader/writer synchronization that could still race
> > - Ensured all device state access is properly synchronized
> > - Replace sprintf() with sysfs_emit() in all sysfs show functions
> > - Use sysfs_streq() instead of strcmp() for proper input parsing
> > - Implement locked/unlocked SPI function variants to prevent deadlock
> > - Use channel snapshot to ensure atomic read operations
> > - Fix sizeof() usage in spi_read to be more explicit (sizeof(buf))
> > - Make oti write operations atomic (SPI write + shadow update under lock)
> > - Fix race condition in ad7816_set_oti() by taking channel_id snapshot under lock
> > - Fix return type consistency (ssize_t vs int) in show functions
> > - Use chip->id instead of string comparison for channel validation
> > - Add explicit cast for narrowing assignment
> > - Add default case for unknown chip ID validation
> > - Use cansleep GPIO variants in sleepable context
> > - Improve lock documentation for protected resources
> > ---  
> 
> This is way to much to do in a single patch. Also, given that this
> part is obsolete [1] and this driver is in staging, is it really
> worth all of this effort to fix it up?
> 
> [1]: https://www.analog.com/en/products/ad7816.html

Sadly not obsolete (or at least not all of the supported parts).
I checked these out the other day - the ad7817 is a production part.

Biggest issue here is slow down!  Too may versions, without
time for thorough review before another one turns up.  Aim for
at least a few days, or a 1 week between versions.

Jonathan

> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4] staging: iio: adc: ad7816: fix race condition in SPI operations
  2025-09-01 16:03 [PATCH v4] staging: iio: adc: ad7816: fix race condition in SPI operations Mohammad Amin Hosseini
  2025-09-01 17:00 ` David Lechner
  2025-09-01 19:00 ` Dan Carpenter
@ 2025-09-01 20:01 ` Greg KH
  2 siblings, 0 replies; 5+ messages in thread
From: Greg KH @ 2025-09-01 20:01 UTC (permalink / raw)
  To: Mohammad Amin Hosseini
  Cc: linux-iio, linux-staging, linux-kernel, jic23, lars,
	Michael.Hennerich, dlechner, nuno.sa, andy, sonic.zhang, vapier,
	dan.carpenter

On Mon, Sep 01, 2025 at 07:33:10PM +0330, Mohammad Amin Hosseini wrote:
> The ad7816 driver lacks proper synchronization around SPI operations
> and device state access. Concurrent access from multiple threads can
> lead to data corruption and inconsistent device state.
> 
> The driver performs sequences of GPIO pin manipulations followed by
> SPI transactions without any locking. Device state variables (mode,
> channel_id, oti_data) are also accessed without synchronization.
> 
> This bug was found through manual code review using static analysis
> techniques. The review focused on identifying unsynchronized access
> patterns to shared resources. Key indicators were:
> - GPIO pin state changes followed by SPI operations without atomicity
> - Shared state variables accessed from multiple sysfs entry points
> - No mutex or spinlock protection around sections
> - Potential for interleaved execution in multi-threaded environments
> 
> The review methodology involved tracing data flow paths and identifying
> points where concurrent access could corrupt device state or SPI
> communication sequences.
> 
> Add io_lock mutex to protect:
> - SPI transactions and GPIO sequences in read/write functions
> - Device state variables in sysfs show/store functions
> - Concurrent access to chip configuration
> 
> This prevents race conditions when multiple processes access the device
> simultaneously through sysfs attributes or device file operations.
> 
> Fixes: 7924425db04a ("staging: iio: adc: new driver for AD7816 devices")
> 
> Signed-off-by: Mohammad Amin Hosseini <moahmmad.hosseinii@gmail.com>
> 
> ---
> Changes in v4:
> - Added locking to reader functions (show_mode, show_channel, show_oti)
> - Fixed incomplete reader/writer synchronization that could still race
> - Ensured all device state access is properly synchronized
> - Replace sprintf() with sysfs_emit() in all sysfs show functions
> - Use sysfs_streq() instead of strcmp() for proper input parsing
> - Implement locked/unlocked SPI function variants to prevent deadlock
> - Use channel snapshot to ensure atomic read operations
> - Fix sizeof() usage in spi_read to be more explicit (sizeof(buf))
> - Make oti write operations atomic (SPI write + shadow update under lock)
> - Fix race condition in ad7816_set_oti() by taking channel_id snapshot under lock
> - Fix return type consistency (ssize_t vs int) in show functions
> - Use chip->id instead of string comparison for channel validation
> - Add explicit cast for narrowing assignment
> - Add default case for unknown chip ID validation
> - Use cansleep GPIO variants in sleepable context
> - Improve lock documentation for protected resources
> ---
>  drivers/staging/iio/adc/ad7816.c | 210 ++++++++++++++++++++-----------
>  1 file changed, 138 insertions(+), 72 deletions(-)

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- Your patch did many different things all at once, making it difficult
  to review.  All Linux kernel patches need to only do one thing at a
  time.  If you need to do multiple things (such as clean up all coding
  style issues in a file/driver), do it in a sequence of patches, each
  one doing only one thing.  This will make it easier to review the
  patches to ensure that they are correct, and to help alleviate any
  merge issues that larger patches can cause.

- You have marked a patch with a "Fixes:" tag for a commit that is in an
  older released kernel, yet you do not have a cc: stable line in the
  signed-off-by area at all, which means that the patch will not be
  applied to any older kernel releases.  To properly fix this, please
  follow the documented rules in the
  Documentation/process/stable-kernel-rules.rst file for how to resolve
  this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-09-01 20:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-01 16:03 [PATCH v4] staging: iio: adc: ad7816: fix race condition in SPI operations Mohammad Amin Hosseini
2025-09-01 17:00 ` David Lechner
2025-09-01 19:46   ` Jonathan Cameron
2025-09-01 19:00 ` Dan Carpenter
2025-09-01 20:01 ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).