public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list
@ 2024-01-02 15:59 Bartosz Golaszewski
  2024-01-02 15:59 ` [PATCH v2 1/3] gpiolib: remove the GPIO device from the list when it's unregistered Bartosz Golaszewski
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Bartosz Golaszewski @ 2024-01-02 15:59 UTC (permalink / raw)
  To: Linus Walleij; +Cc: linux-gpio, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

I'm still figuring out how to keep GPIO descriptors coherent while
(mostly) lockless. In the meantime, I found a potential race-condition
during GPIO descriptor lookup and also figured that the correct way to
protect the GPIO device list is actually a read-write semaphore as we're
not modifying the list very often and readers should be able to iterate
over it concurrently.

The first patch in this series is new in v2. I realized that we must not
wait until .release() to remove the GPIO device from the list as this is
why pinning down the GPIO device list during lookup would never work -
we always could end up re-taking a reference to an object that was being
released if it got looked up between when the last reference is dropped
and the object is finally removed from the device list.

v1 -> v2:
- add patch 1/3 to fix a release timing issue

Bartosz Golaszewski (3):
  gpiolib: remove the GPIO device from the list when it's unregistered
  gpiolib: replace the GPIO device mutex with a read-write semaphore
  gpiolib: pin GPIO devices in place during descriptor lookup

 drivers/gpio/gpiolib-sysfs.c |  2 +-
 drivers/gpio/gpiolib.c       | 62 ++++++++++++++++++++----------------
 drivers/gpio/gpiolib.h       |  2 +-
 3 files changed, 36 insertions(+), 30 deletions(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 1/3] gpiolib: remove the GPIO device from the list when it's unregistered
  2024-01-02 15:59 [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list Bartosz Golaszewski
@ 2024-01-02 15:59 ` Bartosz Golaszewski
  2024-01-02 22:11   ` Linus Walleij
  2024-01-02 15:59 ` [PATCH v2 2/3] gpiolib: replace the GPIO device mutex with a read-write semaphore Bartosz Golaszewski
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Bartosz Golaszewski @ 2024-01-02 15:59 UTC (permalink / raw)
  To: Linus Walleij; +Cc: linux-gpio, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

If we wait until the GPIO device's .release() callback gets invoked
before we remove it from the global device list, then we risk that
someone will look it up using gpio_device_find() between where we
dropped the last reference and before .release() is done taking a
reference again to an object that's being released.

The device must be removed when it's being unregistered - just like how
we remove it from the GPIO bus.

Fixes: ff2b13592299 ("gpio: make the gpiochip a real device")
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/gpio/gpiolib.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index e21497b989a1..e019c4243809 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -651,9 +651,6 @@ static void gpiodev_release(struct device *dev)
 {
 	struct gpio_device *gdev = to_gpio_device(dev);
 
-	scoped_guard(mutex, &gpio_devices_lock)
-		list_del(&gdev->list);
-
 	ida_free(&gpio_ida, gdev->id);
 	kfree_const(gdev->label);
 	kfree(gdev->descs);
@@ -1068,6 +1065,9 @@ void gpiochip_remove(struct gpio_chip *gc)
 		dev_crit(&gdev->dev,
 			 "REMOVING GPIOCHIP WITH GPIOS STILL REQUESTED\n");
 
+	scoped_guard(mutex, &gpio_devices_lock)
+		list_del(&gdev->list);
+
 	/*
 	 * The gpiochip side puts its use of the device to rest here:
 	 * if there are no userspace clients, the chardev and device will
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 2/3] gpiolib: replace the GPIO device mutex with a read-write semaphore
  2024-01-02 15:59 [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list Bartosz Golaszewski
  2024-01-02 15:59 ` [PATCH v2 1/3] gpiolib: remove the GPIO device from the list when it's unregistered Bartosz Golaszewski
@ 2024-01-02 15:59 ` Bartosz Golaszewski
  2024-01-02 22:12   ` Linus Walleij
  2024-01-02 15:59 ` [PATCH v2 3/3] gpiolib: pin GPIO devices in place during descriptor lookup Bartosz Golaszewski
  2024-01-04  9:39 ` [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list Bartosz Golaszewski
  3 siblings, 1 reply; 10+ messages in thread
From: Bartosz Golaszewski @ 2024-01-02 15:59 UTC (permalink / raw)
  To: Linus Walleij; +Cc: linux-gpio, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

There are only two spots where we modify (add to or remove objects from)
the GPIO device list. Readers should be able to access it concurrently.
Replace the mutex with a read-write semaphore and adjust the locking
operations accordingly.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/gpio/gpiolib-sysfs.c |  2 +-
 drivers/gpio/gpiolib.c       | 18 +++++++++---------
 drivers/gpio/gpiolib.h       |  2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpio/gpiolib-sysfs.c b/drivers/gpio/gpiolib-sysfs.c
index 2d29ae37d953..4dbf298bb5dd 100644
--- a/drivers/gpio/gpiolib-sysfs.c
+++ b/drivers/gpio/gpiolib-sysfs.c
@@ -773,7 +773,7 @@ int gpiochip_sysfs_register_all(void)
 	struct gpio_device *gdev;
 	int ret;
 
-	guard(mutex)(&gpio_devices_lock);
+	guard(rwsem_read)(&gpio_devices_sem);
 
 	list_for_each_entry(gdev, &gpio_devices, list) {
 		if (gdev->mockdev)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index e019c4243809..4c93cf73a826 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -85,7 +85,7 @@ static DEFINE_MUTEX(gpio_lookup_lock);
 static LIST_HEAD(gpio_lookup_list);
 
 LIST_HEAD(gpio_devices);
-DEFINE_MUTEX(gpio_devices_lock);
+DECLARE_RWSEM(gpio_devices_sem);
 
 static DEFINE_MUTEX(gpio_machine_hogs_mutex);
 static LIST_HEAD(gpio_machine_hogs);
@@ -118,7 +118,7 @@ struct gpio_desc *gpio_to_desc(unsigned gpio)
 {
 	struct gpio_device *gdev;
 
-	scoped_guard(mutex, &gpio_devices_lock) {
+	scoped_guard(rwsem_read, &gpio_devices_sem) {
 		list_for_each_entry(gdev, &gpio_devices, list) {
 			if (gdev->base <= gpio &&
 			    gdev->base + gdev->ngpio > gpio)
@@ -402,7 +402,7 @@ static struct gpio_desc *gpio_name_to_desc(const char * const name)
 	if (!name)
 		return NULL;
 
-	guard(mutex)(&gpio_devices_lock);
+	guard(rwsem_read)(&gpio_devices_sem);
 
 	list_for_each_entry(gdev, &gpio_devices, list) {
 		struct gpio_desc *desc;
@@ -871,7 +871,7 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
 
 	gdev->ngpio = gc->ngpio;
 
-	scoped_guard(mutex, &gpio_devices_lock) {
+	scoped_guard(rwsem_write, &gpio_devices_sem) {
 		/*
 		 * TODO: this allocates a Linux GPIO number base in the global
 		 * GPIO numberspace for this chip. In the long run we want to
@@ -1001,7 +1001,7 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
 		goto err_print_message;
 	}
 err_remove_from_list:
-	scoped_guard(mutex, &gpio_devices_lock)
+	scoped_guard(rwsem_write, &gpio_devices_sem)
 		list_del(&gdev->list);
 err_free_label:
 	kfree_const(gdev->label);
@@ -1065,7 +1065,7 @@ void gpiochip_remove(struct gpio_chip *gc)
 		dev_crit(&gdev->dev,
 			 "REMOVING GPIOCHIP WITH GPIOS STILL REQUESTED\n");
 
-	scoped_guard(mutex, &gpio_devices_lock)
+	scoped_guard(rwsem_write, &gpio_devices_sem)
 		list_del(&gdev->list);
 
 	/*
@@ -1114,7 +1114,7 @@ struct gpio_device *gpio_device_find(void *data,
 	 */
 	might_sleep();
 
-	guard(mutex)(&gpio_devices_lock);
+	guard(rwsem_read)(&gpio_devices_sem);
 
 	list_for_each_entry(gdev, &gpio_devices, list) {
 		if (gdev->chip && match(gdev->chip, data))
@@ -4730,7 +4730,7 @@ static void *gpiolib_seq_start(struct seq_file *s, loff_t *pos)
 
 	s->private = "";
 
-	guard(mutex)(&gpio_devices_lock);
+	guard(rwsem_read)(&gpio_devices_sem);
 
 	list_for_each_entry(gdev, &gpio_devices, list) {
 		if (index-- == 0)
@@ -4745,7 +4745,7 @@ static void *gpiolib_seq_next(struct seq_file *s, void *v, loff_t *pos)
 	struct gpio_device *gdev = v;
 	void *ret = NULL;
 
-	scoped_guard(mutex, &gpio_devices_lock) {
+	scoped_guard(rwsem_read, &gpio_devices_sem) {
 		if (list_is_last(&gdev->list, &gpio_devices))
 			ret = NULL;
 		else
diff --git a/drivers/gpio/gpiolib.h b/drivers/gpio/gpiolib.h
index 0ce7451a6b24..97df54abf57a 100644
--- a/drivers/gpio/gpiolib.h
+++ b/drivers/gpio/gpiolib.h
@@ -137,7 +137,7 @@ int gpiod_set_transitory(struct gpio_desc *desc, bool transitory);
 
 extern spinlock_t gpio_lock;
 extern struct list_head gpio_devices;
-extern struct mutex gpio_devices_lock;
+extern struct rw_semaphore gpio_devices_sem;
 
 void gpiod_line_state_notify(struct gpio_desc *desc, unsigned long action);
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 3/3] gpiolib: pin GPIO devices in place during descriptor lookup
  2024-01-02 15:59 [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list Bartosz Golaszewski
  2024-01-02 15:59 ` [PATCH v2 1/3] gpiolib: remove the GPIO device from the list when it's unregistered Bartosz Golaszewski
  2024-01-02 15:59 ` [PATCH v2 2/3] gpiolib: replace the GPIO device mutex with a read-write semaphore Bartosz Golaszewski
@ 2024-01-02 15:59 ` Bartosz Golaszewski
  2024-01-02 22:15   ` Linus Walleij
  2024-01-08 13:03   ` Marek Szyprowski
  2024-01-04  9:39 ` [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list Bartosz Golaszewski
  3 siblings, 2 replies; 10+ messages in thread
From: Bartosz Golaszewski @ 2024-01-02 15:59 UTC (permalink / raw)
  To: Linus Walleij; +Cc: linux-gpio, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

There's time between when we locate the relevant descriptor during
lookup and when we actually take the reference to its parent GPIO
device where - if the GPIO device in question is removed - we'll end up
with a dangling pointer to freed memory. Make sure devices cannot be
removed until we hold a new reference to the device.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/gpio/gpiolib.c | 40 +++++++++++++++++++++++-----------------
 1 file changed, 23 insertions(+), 17 deletions(-)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 4c93cf73a826..be57f8d6aeae 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -4134,27 +4134,33 @@ static struct gpio_desc *gpiod_find_and_request(struct device *consumer,
 	struct gpio_desc *desc;
 	int ret;
 
-	desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx, &flags, &lookupflags);
-	if (gpiod_not_found(desc) && platform_lookup_allowed) {
+	scoped_guard(rwsem_read, &gpio_devices_sem) {
+		desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx,
+					    &flags, &lookupflags);
+		if (gpiod_not_found(desc) && platform_lookup_allowed) {
+			/*
+			 * Either we are not using DT or ACPI, or their lookup
+			 * did not return a result. In that case, use platform
+			 * lookup as a fallback.
+			 */
+			dev_dbg(consumer,
+				"using lookup tables for GPIO lookup\n");
+			desc = gpiod_find(consumer, con_id, idx, &lookupflags);
+		}
+
+		if (IS_ERR(desc)) {
+			dev_dbg(consumer, "No GPIO consumer %s found\n",
+				con_id);
+			return desc;
+		}
+
 		/*
-		 * Either we are not using DT or ACPI, or their lookup did not
-		 * return a result. In that case, use platform lookup as a
-		 * fallback.
+		 * If a connection label was passed use that, else attempt to
+		 * use the device name as label
 		 */
-		dev_dbg(consumer, "using lookup tables for GPIO lookup\n");
-		desc = gpiod_find(consumer, con_id, idx, &lookupflags);
+		ret = gpiod_request(desc, label);
 	}
 
-	if (IS_ERR(desc)) {
-		dev_dbg(consumer, "No GPIO consumer %s found\n", con_id);
-		return desc;
-	}
-
-	/*
-	 * If a connection label was passed use that, else attempt to use
-	 * the device name as label
-	 */
-	ret = gpiod_request(desc, label);
 	if (ret) {
 		if (!(ret == -EBUSY && flags & GPIOD_FLAGS_BIT_NONEXCLUSIVE))
 			return ERR_PTR(ret);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 1/3] gpiolib: remove the GPIO device from the list when it's unregistered
  2024-01-02 15:59 ` [PATCH v2 1/3] gpiolib: remove the GPIO device from the list when it's unregistered Bartosz Golaszewski
@ 2024-01-02 22:11   ` Linus Walleij
  0 siblings, 0 replies; 10+ messages in thread
From: Linus Walleij @ 2024-01-02 22:11 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: linux-gpio, linux-kernel, Bartosz Golaszewski

On Tue, Jan 2, 2024 at 4:59 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:

> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> If we wait until the GPIO device's .release() callback gets invoked
> before we remove it from the global device list, then we risk that
> someone will look it up using gpio_device_find() between where we
> dropped the last reference and before .release() is done taking a
> reference again to an object that's being released.
>
> The device must be removed when it's being unregistered - just like how
> we remove it from the GPIO bus.
>
> Fixes: ff2b13592299 ("gpio: make the gpiochip a real device")
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Makes sense!
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 2/3] gpiolib: replace the GPIO device mutex with a read-write semaphore
  2024-01-02 15:59 ` [PATCH v2 2/3] gpiolib: replace the GPIO device mutex with a read-write semaphore Bartosz Golaszewski
@ 2024-01-02 22:12   ` Linus Walleij
  0 siblings, 0 replies; 10+ messages in thread
From: Linus Walleij @ 2024-01-02 22:12 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: linux-gpio, linux-kernel, Bartosz Golaszewski

On Tue, Jan 2, 2024 at 4:59 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:

> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> There are only two spots where we modify (add to or remove objects from)
> the GPIO device list. Readers should be able to access it concurrently.
> Replace the mutex with a read-write semaphore and adjust the locking
> operations accordingly.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Looks correct to me.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] gpiolib: pin GPIO devices in place during descriptor lookup
  2024-01-02 15:59 ` [PATCH v2 3/3] gpiolib: pin GPIO devices in place during descriptor lookup Bartosz Golaszewski
@ 2024-01-02 22:15   ` Linus Walleij
  2024-01-08 13:03   ` Marek Szyprowski
  1 sibling, 0 replies; 10+ messages in thread
From: Linus Walleij @ 2024-01-02 22:15 UTC (permalink / raw)
  To: Bartosz Golaszewski; +Cc: linux-gpio, linux-kernel, Bartosz Golaszewski

On Tue, Jan 2, 2024 at 4:59 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:

> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> There's time between when we locate the relevant descriptor during
> lookup and when we actually take the reference to its parent GPIO
> device where - if the GPIO device in question is removed - we'll end up
> with a dangling pointer to freed memory. Make sure devices cannot be
> removed until we hold a new reference to the device.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Also looks right to me!
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list
  2024-01-02 15:59 [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list Bartosz Golaszewski
                   ` (2 preceding siblings ...)
  2024-01-02 15:59 ` [PATCH v2 3/3] gpiolib: pin GPIO devices in place during descriptor lookup Bartosz Golaszewski
@ 2024-01-04  9:39 ` Bartosz Golaszewski
  3 siblings, 0 replies; 10+ messages in thread
From: Bartosz Golaszewski @ 2024-01-04  9:39 UTC (permalink / raw)
  To: Linus Walleij; +Cc: linux-gpio, linux-kernel, Bartosz Golaszewski

On Tue, Jan 2, 2024 at 4:59 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> I'm still figuring out how to keep GPIO descriptors coherent while
> (mostly) lockless. In the meantime, I found a potential race-condition
> during GPIO descriptor lookup and also figured that the correct way to
> protect the GPIO device list is actually a read-write semaphore as we're
> not modifying the list very often and readers should be able to iterate
> over it concurrently.
>
> The first patch in this series is new in v2. I realized that we must not
> wait until .release() to remove the GPIO device from the list as this is
> why pinning down the GPIO device list during lookup would never work -
> we always could end up re-taking a reference to an object that was being
> released if it got looked up between when the last reference is dropped
> and the object is finally removed from the device list.
>
> v1 -> v2:
> - add patch 1/3 to fix a release timing issue
>
> Bartosz Golaszewski (3):
>   gpiolib: remove the GPIO device from the list when it's unregistered
>   gpiolib: replace the GPIO device mutex with a read-write semaphore
>   gpiolib: pin GPIO devices in place during descriptor lookup
>
>  drivers/gpio/gpiolib-sysfs.c |  2 +-
>  drivers/gpio/gpiolib.c       | 62 ++++++++++++++++++++----------------
>  drivers/gpio/gpiolib.h       |  2 +-
>  3 files changed, 36 insertions(+), 30 deletions(-)
>
> --
> 2.40.1
>

Series applied.

Bart

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] gpiolib: pin GPIO devices in place during descriptor lookup
  2024-01-02 15:59 ` [PATCH v2 3/3] gpiolib: pin GPIO devices in place during descriptor lookup Bartosz Golaszewski
  2024-01-02 22:15   ` Linus Walleij
@ 2024-01-08 13:03   ` Marek Szyprowski
  2024-01-08 15:39     ` Bartosz Golaszewski
  1 sibling, 1 reply; 10+ messages in thread
From: Marek Szyprowski @ 2024-01-08 13:03 UTC (permalink / raw)
  To: Bartosz Golaszewski, Linus Walleij
  Cc: linux-gpio, linux-kernel, Bartosz Golaszewski

On 02.01.2024 16:59, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> There's time between when we locate the relevant descriptor during
> lookup and when we actually take the reference to its parent GPIO
> device where - if the GPIO device in question is removed - we'll end up
> with a dangling pointer to freed memory. Make sure devices cannot be
> removed until we hold a new reference to the device.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

This patch landed in linux-next as commit db660b9a9f86 ("gpiolib: pin 
GPIO devices in place during descriptor lookup"). Unfortunately it 
introduces a following lock-dep warning:

============================================
WARNING: possible recursive locking detected
6.7.0-rc7-00062-gdb660b9a9f86 #7819 Not tainted
--------------------------------------------
kworker/u4:2/27 is trying to acquire lock:
c13f4e1c (gpio_devices_sem){++++}-{3:3}, at: gpio_device_find+0x30/0x94

but task is already holding lock:
c13f4e1c (gpio_devices_sem){++++}-{3:3}, at: 
gpiod_find_and_request+0x44/0x594

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(gpio_devices_sem);
   lock(gpio_devices_sem);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

4 locks held by kworker/u4:2/27:
  #0: c1c06ca8 ((wq_completion)events_unbound){+.+.}-{0:0}, at: 
process_one_work+0x148/0x608
  #1: e093df20 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: 
process_one_work+0x148/0x608
  #2: c1f3048c (&dev->mutex){....}-{3:3}, at: 
__driver_attach_async_helper+0x38/0xec
  #3: c13f4e1c (gpio_devices_sem){++++}-{3:3}, at: 
gpiod_find_and_request+0x44/0x594

stack backtrace:
CPU: 0 PID: 27 Comm: kworker/u4:2 Not tainted 
6.7.0-rc7-00062-gdb660b9a9f86 #7819
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events_unbound async_run_entry_fn
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from __lock_acquire+0x1300/0x2984
  __lock_acquire from lock_acquire+0x130/0x37c
  lock_acquire from down_read+0x44/0x224
  down_read from gpio_device_find+0x30/0x94
  gpio_device_find from of_get_named_gpiod_flags+0xa4/0x3a8
  of_get_named_gpiod_flags from of_find_gpio+0x80/0x168
  of_find_gpio from gpiod_find_and_request+0x120/0x594
  gpiod_find_and_request from gpiod_get_optional+0x54/0x90
  gpiod_get_optional from reg_fixed_voltage_probe+0x200/0x400
  reg_fixed_voltage_probe from platform_probe+0x5c/0xb8
  platform_probe from really_probe+0xe0/0x400
  really_probe from __driver_probe_device+0x9c/0x1f0
  __driver_probe_device from driver_probe_device+0x30/0xc0
  driver_probe_device from __driver_attach_async_helper+0x54/0xec
  __driver_attach_async_helper from async_run_entry_fn+0x40/0x154
  async_run_entry_fn from process_one_work+0x204/0x608
  process_one_work from worker_thread+0x1e0/0x498
  worker_thread from kthread+0x104/0x138
  kthread from ret_from_fork+0x14/0x28
Exception stack(0xe093dfb0 to 0xe093dff8)
...


Taking gpio_devices_sem more than once for reading is safe, but it looks 
that it needs some lock-dep annotations to to make it happy and avoid 
the above warning.


> ---
>   drivers/gpio/gpiolib.c | 40 +++++++++++++++++++++++-----------------
>   1 file changed, 23 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index 4c93cf73a826..be57f8d6aeae 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -4134,27 +4134,33 @@ static struct gpio_desc *gpiod_find_and_request(struct device *consumer,
>   	struct gpio_desc *desc;
>   	int ret;
>   
> -	desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx, &flags, &lookupflags);
> -	if (gpiod_not_found(desc) && platform_lookup_allowed) {
> +	scoped_guard(rwsem_read, &gpio_devices_sem) {
> +		desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx,
> +					    &flags, &lookupflags);
> +		if (gpiod_not_found(desc) && platform_lookup_allowed) {
> +			/*
> +			 * Either we are not using DT or ACPI, or their lookup
> +			 * did not return a result. In that case, use platform
> +			 * lookup as a fallback.
> +			 */
> +			dev_dbg(consumer,
> +				"using lookup tables for GPIO lookup\n");
> +			desc = gpiod_find(consumer, con_id, idx, &lookupflags);
> +		}
> +
> +		if (IS_ERR(desc)) {
> +			dev_dbg(consumer, "No GPIO consumer %s found\n",
> +				con_id);
> +			return desc;
> +		}
> +
>   		/*
> -		 * Either we are not using DT or ACPI, or their lookup did not
> -		 * return a result. In that case, use platform lookup as a
> -		 * fallback.
> +		 * If a connection label was passed use that, else attempt to
> +		 * use the device name as label
>   		 */
> -		dev_dbg(consumer, "using lookup tables for GPIO lookup\n");
> -		desc = gpiod_find(consumer, con_id, idx, &lookupflags);
> +		ret = gpiod_request(desc, label);
>   	}
>   
> -	if (IS_ERR(desc)) {
> -		dev_dbg(consumer, "No GPIO consumer %s found\n", con_id);
> -		return desc;
> -	}
> -
> -	/*
> -	 * If a connection label was passed use that, else attempt to use
> -	 * the device name as label
> -	 */
> -	ret = gpiod_request(desc, label);
>   	if (ret) {
>   		if (!(ret == -EBUSY && flags & GPIOD_FLAGS_BIT_NONEXCLUSIVE))
>   			return ERR_PTR(ret);

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 3/3] gpiolib: pin GPIO devices in place during descriptor lookup
  2024-01-08 13:03   ` Marek Szyprowski
@ 2024-01-08 15:39     ` Bartosz Golaszewski
  0 siblings, 0 replies; 10+ messages in thread
From: Bartosz Golaszewski @ 2024-01-08 15:39 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Linus Walleij, linux-gpio, linux-kernel, Bartosz Golaszewski

On Mon, Jan 8, 2024 at 2:03 PM Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
>
> On 02.01.2024 16:59, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > There's time between when we locate the relevant descriptor during
> > lookup and when we actually take the reference to its parent GPIO
> > device where - if the GPIO device in question is removed - we'll end up
> > with a dangling pointer to freed memory. Make sure devices cannot be
> > removed until we hold a new reference to the device.
> >
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>
> This patch landed in linux-next as commit db660b9a9f86 ("gpiolib: pin
> GPIO devices in place during descriptor lookup"). Unfortunately it
> introduces a following lock-dep warning:
>
> ============================================
> WARNING: possible recursive locking detected
> 6.7.0-rc7-00062-gdb660b9a9f86 #7819 Not tainted
> --------------------------------------------
> kworker/u4:2/27 is trying to acquire lock:
> c13f4e1c (gpio_devices_sem){++++}-{3:3}, at: gpio_device_find+0x30/0x94
>
> but task is already holding lock:
> c13f4e1c (gpio_devices_sem){++++}-{3:3}, at:
> gpiod_find_and_request+0x44/0x594
>
> other info that might help us debug this:
>   Possible unsafe locking scenario:
>
>         CPU0
>         ----
>    lock(gpio_devices_sem);
>    lock(gpio_devices_sem);
>
>   *** DEADLOCK ***
>
>   May be due to missing lock nesting notation
>
> 4 locks held by kworker/u4:2/27:
>   #0: c1c06ca8 ((wq_completion)events_unbound){+.+.}-{0:0}, at:
> process_one_work+0x148/0x608
>   #1: e093df20 ((work_completion)(&entry->work)){+.+.}-{0:0}, at:
> process_one_work+0x148/0x608
>   #2: c1f3048c (&dev->mutex){....}-{3:3}, at:
> __driver_attach_async_helper+0x38/0xec
>   #3: c13f4e1c (gpio_devices_sem){++++}-{3:3}, at:
> gpiod_find_and_request+0x44/0x594
>
> stack backtrace:
> CPU: 0 PID: 27 Comm: kworker/u4:2 Not tainted
> 6.7.0-rc7-00062-gdb660b9a9f86 #7819
> Hardware name: Samsung Exynos (Flattened Device Tree)
> Workqueue: events_unbound async_run_entry_fn
>   unwind_backtrace from show_stack+0x10/0x14
>   show_stack from dump_stack_lvl+0x58/0x70
>   dump_stack_lvl from __lock_acquire+0x1300/0x2984
>   __lock_acquire from lock_acquire+0x130/0x37c
>   lock_acquire from down_read+0x44/0x224
>   down_read from gpio_device_find+0x30/0x94
>   gpio_device_find from of_get_named_gpiod_flags+0xa4/0x3a8
>   of_get_named_gpiod_flags from of_find_gpio+0x80/0x168
>   of_find_gpio from gpiod_find_and_request+0x120/0x594
>   gpiod_find_and_request from gpiod_get_optional+0x54/0x90
>   gpiod_get_optional from reg_fixed_voltage_probe+0x200/0x400
>   reg_fixed_voltage_probe from platform_probe+0x5c/0xb8
>   platform_probe from really_probe+0xe0/0x400
>   really_probe from __driver_probe_device+0x9c/0x1f0
>   __driver_probe_device from driver_probe_device+0x30/0xc0
>   driver_probe_device from __driver_attach_async_helper+0x54/0xec
>   __driver_attach_async_helper from async_run_entry_fn+0x40/0x154
>   async_run_entry_fn from process_one_work+0x204/0x608
>   process_one_work from worker_thread+0x1e0/0x498
>   worker_thread from kthread+0x104/0x138
>   kthread from ret_from_fork+0x14/0x28
> Exception stack(0xe093dfb0 to 0xe093dff8)
> ...
>
>
> Taking gpio_devices_sem more than once for reading is safe, but it looks
> that it needs some lock-dep annotations to to make it happy and avoid
> the above warning.
>
>
> > ---
> >   drivers/gpio/gpiolib.c | 40 +++++++++++++++++++++++-----------------
> >   1 file changed, 23 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> > index 4c93cf73a826..be57f8d6aeae 100644
> > --- a/drivers/gpio/gpiolib.c
> > +++ b/drivers/gpio/gpiolib.c
> > @@ -4134,27 +4134,33 @@ static struct gpio_desc *gpiod_find_and_request(struct device *consumer,
> >       struct gpio_desc *desc;
> >       int ret;
> >
> > -     desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx, &flags, &lookupflags);
> > -     if (gpiod_not_found(desc) && platform_lookup_allowed) {
> > +     scoped_guard(rwsem_read, &gpio_devices_sem) {
> > +             desc = gpiod_find_by_fwnode(fwnode, consumer, con_id, idx,
> > +                                         &flags, &lookupflags);
> > +             if (gpiod_not_found(desc) && platform_lookup_allowed) {
> > +                     /*
> > +                      * Either we are not using DT or ACPI, or their lookup
> > +                      * did not return a result. In that case, use platform
> > +                      * lookup as a fallback.
> > +                      */
> > +                     dev_dbg(consumer,
> > +                             "using lookup tables for GPIO lookup\n");
> > +                     desc = gpiod_find(consumer, con_id, idx, &lookupflags);
> > +             }
> > +
> > +             if (IS_ERR(desc)) {
> > +                     dev_dbg(consumer, "No GPIO consumer %s found\n",
> > +                             con_id);
> > +                     return desc;
> > +             }
> > +
> >               /*
> > -              * Either we are not using DT or ACPI, or their lookup did not
> > -              * return a result. In that case, use platform lookup as a
> > -              * fallback.
> > +              * If a connection label was passed use that, else attempt to
> > +              * use the device name as label
> >                */
> > -             dev_dbg(consumer, "using lookup tables for GPIO lookup\n");
> > -             desc = gpiod_find(consumer, con_id, idx, &lookupflags);
> > +             ret = gpiod_request(desc, label);
> >       }
> >
> > -     if (IS_ERR(desc)) {
> > -             dev_dbg(consumer, "No GPIO consumer %s found\n", con_id);
> > -             return desc;
> > -     }
> > -
> > -     /*
> > -      * If a connection label was passed use that, else attempt to use
> > -      * the device name as label
> > -      */
> > -     ret = gpiod_request(desc, label);
> >       if (ret) {
> >               if (!(ret == -EBUSY && flags & GPIOD_FLAGS_BIT_NONEXCLUSIVE))
> >                       return ERR_PTR(ret);
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>

Thanks for the report. I think it may have come too late in the
release cycle as it has the potential to break a lot of things. I will
back it out of my for-next branch. I'll resend it for v6.9.

Bart

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-01-08 15:39 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-02 15:59 [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list Bartosz Golaszewski
2024-01-02 15:59 ` [PATCH v2 1/3] gpiolib: remove the GPIO device from the list when it's unregistered Bartosz Golaszewski
2024-01-02 22:11   ` Linus Walleij
2024-01-02 15:59 ` [PATCH v2 2/3] gpiolib: replace the GPIO device mutex with a read-write semaphore Bartosz Golaszewski
2024-01-02 22:12   ` Linus Walleij
2024-01-02 15:59 ` [PATCH v2 3/3] gpiolib: pin GPIO devices in place during descriptor lookup Bartosz Golaszewski
2024-01-02 22:15   ` Linus Walleij
2024-01-08 13:03   ` Marek Szyprowski
2024-01-08 15:39     ` Bartosz Golaszewski
2024-01-04  9:39 ` [PATCH v2 0/3] gpiolib: use a read-write semaphore to protect the GPIO device list Bartosz Golaszewski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox