All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()
Date: Tue, 16 Jan 2018 17:32:08 -0500	[thread overview]
Message-ID: <20180116223207.GA32637@redhat.com> (raw)
In-Reply-To: <20180116181752.25847-4-bart.vanassche@wdc.com>

On Tue, Jan 16 2018 at  1:17pm -0500,
Bart Van Assche <bart.vanassche@wdc.com> wrote:

> The __blk_mq_register_dev(), blk_mq_unregister_dev(),
> elv_register_queue() and elv_unregister_queue() calls need to be
> protected with sysfs_lock but other code in these functions not.
> Hence protect only this code with sysfs_lock. This patch fixes a
> locking inversion issue in blk_unregister_queue() and also in an
> error path of blk_register_queue(): it is not allowed to hold
> sysfs_lock around the kobject_del(&q->kobj) call.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> ---
>  block/blk-sysfs.c | 13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 4a6a40ffd78e..e9ce45ff0ef2 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -909,11 +909,12 @@ int blk_register_queue(struct gendisk *disk)
>  	if (q->request_fn || (q->mq_ops && q->elevator)) {
>  		ret = elv_register_queue(q);
>  		if (ret) {
> +			mutex_unlock(&q->sysfs_lock);
>  			kobject_uevent(&q->kobj, KOBJ_REMOVE);
>  			kobject_del(&q->kobj);
>  			blk_trace_remove_sysfs(dev);
>  			kobject_put(&dev->kobj);
> -			goto unlock;
> +			return ret;
>  		}
>  	}
>  	ret = 0;
> @@ -934,28 +935,22 @@ void blk_unregister_queue(struct gendisk *disk)
>  	if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
>  		return;
>  
> -	/*
> -	 * Protect against the 'queue' kobj being accessed
> -	 * while/after it is removed.
> -	 */
> -	mutex_lock(&q->sysfs_lock);
> -
>  	spin_lock_irq(q->queue_lock);
>  	queue_flag_clear(QUEUE_FLAG_REGISTERED, q);
>  	spin_unlock_irq(q->queue_lock);
>  
>  	wbt_exit(q);
>  
> +	mutex_lock(&q->sysfs_lock);
>  	if (q->mq_ops)
>  		blk_mq_unregister_dev(disk_to_dev(disk), q);
>  
>  	if (q->request_fn || (q->mq_ops && q->elevator))
>  		elv_unregister_queue(q);

My concern with this change is detailed in the following portion of
the header for commit 667257e8b2988c0183ba23e2bcd6900e87961606:

    2) Conversely, __elevator_change() is testing for QUEUE_FLAG_REGISTERED
    in case elv_iosched_store() loses the race with blk_unregister_queue(),
    it needs a way to know the 'queue' kobj isn't there.

I don't think moving mutex_lock(&q->sysfs_lock); after the clearing of
QUEUE_FLAG_REGISTERED is a step in the right direction.

Current code shows:

blk_cleanup_queue() calls blk_set_queue_dying() while holding
the sysfs_lock.

queue_attr_{show,store} both test if blk_queue_dying(q) while holding
the sysfs_lock.

BUT drivers can/do call del_gendisk() _before_ blk_cleanup_queue().
(if your proposed change above were to go in all of the block drivers
would first need to be audited for the need to call blk_cleanup_queue()
before del_gendisk() -- seems awful).

Therefore it seems to me that all queue_attr_{show,store} are racey vs
blk_unregister_queue() removing the 'queue' kobject.

And it was just that __elevator_change() was myopicly fixed to address
the race whereas a more generic solution was/is needed.  But short of
that more generic fix your change will reintroduce the potential for
hitting the issue that commit e9a823fb34a8b fixed.

In that light, think it best to leave blk_unregister_queue()'s
mutex_lock() above the QUEUE_FLAG_REGISTERED clearing _and_ update
queue_attr_{show,store} to test for QUEUE_FLAG_REGISTERED while holding
sysfs_lock.

Then remove the unicorn test_bit for QUEUE_FLAG_REGISTERED from
__elevator_change().

But it could be I'm wrong for some reason.. as you know that happens ;)

Mike

  reply	other threads:[~2018-01-16 22:32 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-16 18:17 [PATCH 0/3] Avoid that blk_{un,}register_queue() trigger lock inversion Bart Van Assche
2018-01-16 18:17 ` [PATCH 1/3] block: Unexport elv_register_queue() and elv_unregister_queue() Bart Van Assche
2018-01-16 18:17 ` [PATCH 2/3] block: Document scheduler change code locking requirements Bart Van Assche
2018-01-16 18:17 ` [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue() Bart Van Assche
2018-01-16 22:32   ` Mike Snitzer [this message]
2018-01-17  0:03     ` Bart Van Assche
2018-01-17  1:23       ` Ming Lei
2018-01-17  2:17         ` Bart Van Assche
2018-01-17 13:04           ` Ming Lei
2018-01-17 17:19             ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180116223207.GA32637@redhat.com \
    --to=snitzer@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=bart.vanassche@wdc.com \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.