linux-iio.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jonathan Cameron <jic23@kernel.org>
To: Fabrice Gasnier <fabrice.gasnier@st.com>
Cc: <lars@metafoo.de>, <knaack.h@gmx.de>, <pmeerw@pmeerw.net>,
	<linux-kernel@vger.kernel.org>, <linux-iio@vger.kernel.org>
Subject: Re: [RFC PATCH] iio: core: fix a possible circular locking dependency
Date: Sun, 24 Mar 2019 15:47:04 +0000	[thread overview]
Message-ID: <20190324154704.277327e2@archlinux> (raw)
In-Reply-To: <1553262846-23126-1-git-send-email-fabrice.gasnier@st.com>

On Fri, 22 Mar 2019 14:54:06 +0100
Fabrice Gasnier <fabrice.gasnier@st.com> wrote:

> This fixes a possible circular locking dependency detected warning seen
> with:
> - CONFIG_PROVE_LOCKING=y
> - consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")
> 
> When using the IIO consumer interface, e.g. iio_channel_get(),
> the consumer device will likely call iio_read_channel_raw() or similar that
> rely on 'info_exist_lock' mutex.
> 
> typically:
> ...
> 	mutex_lock(&chan->indio_dev->info_exist_lock);
> 	if (chan->indio_dev->info == NULL) {
> 		ret = -ENODEV;
> 		goto err_unlock;
> 	}
> 	ret = do_some_ops()
> err_unlock:
> 	mutex_unlock(&chan->indio_dev->info_exist_lock);
> 	return ret;
> ...
> 
> Same mutex is also hold in iio_device_unregister().
> 
> The following deadlock warning happens when:
> - the consumer device has called an API like iio_read_channel_raw()
>   at least once.
> - the consumer driver is unregistered, removed (unbind from sysfs)
> 
> ======================================================
> WARNING: possible circular locking dependency detected
> 4.19.24 #577 Not tainted
> ------------------------------------------------------
> sh/372 is trying to acquire lock:
> (kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84
> 
> but task is already holding lock:
> (&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60
> 
> which lock already depends on the new lock.
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #1 (&dev->info_exist_lock){+.+.}:  
>        __mutex_lock+0x70/0xa3c
>        mutex_lock_nested+0x1c/0x24
>        iio_read_channel_raw+0x1c/0x60
>        iio_read_channel_info+0xa8/0xb0
>        dev_attr_show+0x1c/0x48
>        sysfs_kf_seq_show+0x84/0xec
>        seq_read+0x154/0x528
>        __vfs_read+0x2c/0x15c
>        vfs_read+0x8c/0x110
>        ksys_read+0x4c/0xac
>        ret_fast_syscall+0x0/0x28
>        0xbedefb60
> 
> -> #0 (kn->count#30){++++}:  
>        lock_acquire+0xd8/0x268
>        __kernfs_remove+0x288/0x374
>        kernfs_remove_by_name_ns+0x3c/0x84
>        remove_files+0x34/0x78
>        sysfs_remove_group+0x40/0x9c
>        sysfs_remove_groups+0x24/0x34
>        device_remove_attrs+0x38/0x64
>        device_del+0x11c/0x360
>        cdev_device_del+0x14/0x2c
>        iio_device_unregister+0x24/0x60
>        release_nodes+0x1bc/0x200
>        device_release_driver_internal+0x1a0/0x230
>        unbind_store+0x80/0x130
>        kernfs_fop_write+0x100/0x1e4
>        __vfs_write+0x2c/0x160
>        vfs_write+0xa4/0x17c
>        ksys_write+0x4c/0xac
>        ret_fast_syscall+0x0/0x28
>        0xbe906840
> 
> other info that might help us debug this:
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(&dev->info_exist_lock);
>                                lock(kn->count#30);
>                                lock(&dev->info_exist_lock);
>   lock(kn->count#30);
> 
>  *** DEADLOCK ***
> ...
> 
> So only hold the mutex to:
> - disable all buffers while 'info' is available
> - set 'info' to NULL
> Then release it to call cdev_device_del and so on.
> 
> Help to reproduce:
> See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt
> sysv {
> 	compatible = "voltage-divider";
> 	io-channels = <&adc 0>;
> 	output-ohms = <22>;
> 	full-ohms = <222>;
> };
> 
> First, go to iio:deviceX for the "voltage-divider", do one read:
> $ cd /sys/bus/iio/devices/iio:deviceX
> $ cat in_voltage0_raw
> 
> Then, unbind the consumer driver. It triggers above deadlock warning.
> $ cd /sys/bus/platform/drivers/iio-rescale/
> $ echo sysv > unbind
> 
> Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
I'm not in principle against the fix. However it is reordering the
remove wrt to the probe which I'm not so keen on.

The cdev register is fundamentally the point where the device
becomes exposed to userspace, so we naturally want to do it last
(and remove it first).  I worry that we may have some paths
in which we don't sanity check the existence of info (which
is kind of our backup plan to indicate the device has gone
away).

Are we safe to instead of reordering, just not take the lock
until after the problem functions?  Info doesn't go
away until later so I think we are.  I haven't looked it in that
much detail though!

Thanks for raising this as it's a nasty little problem.

Jonathan


> ---
>  drivers/iio/industrialio-core.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iio/industrialio-core.c b/drivers/iio/industrialio-core.c
> index 4700fd5..e03d6ff 100644
> --- a/drivers/iio/industrialio-core.c
> +++ b/drivers/iio/industrialio-core.c
> @@ -1745,19 +1745,19 @@ void iio_device_unregister(struct iio_dev *indio_dev)
>  {
>  	mutex_lock(&indio_dev->info_exist_lock);
>  
> -	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
> -
> -	iio_device_unregister_debugfs(indio_dev);
> -
>  	iio_disable_all_buffers(indio_dev);
>  
>  	indio_dev->info = NULL;
>  
> +	mutex_unlock(&indio_dev->info_exist_lock);
> +
> +	cdev_device_del(&indio_dev->chrdev, &indio_dev->dev);
> +
> +	iio_device_unregister_debugfs(indio_dev);
> +
>  	iio_device_wakeup_eventset(indio_dev);
>  	iio_buffer_wakeup_poll(indio_dev);
>  
> -	mutex_unlock(&indio_dev->info_exist_lock);
> -
>  	iio_buffer_free_sysfs_and_mask(indio_dev);
>  }
>  EXPORT_SYMBOL(iio_device_unregister);


  reply	other threads:[~2019-03-24 15:47 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-22 13:54 [RFC PATCH] iio: core: fix a possible circular locking dependency Fabrice Gasnier
2019-03-24 15:47 ` Jonathan Cameron [this message]
2019-03-25 12:51   ` Fabrice Gasnier
2019-03-30 18:46     ` Jonathan Cameron

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190324154704.277327e2@archlinux \
    --to=jic23@kernel.org \
    --cc=fabrice.gasnier@st.com \
    --cc=knaack.h@gmx.de \
    --cc=lars@metafoo.de \
    --cc=linux-iio@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pmeerw@pmeerw.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).