From: Mike Snitzer <snitzer@redhat.com>
To: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue()
Date: Tue, 16 Jan 2018 17:32:08 -0500 [thread overview]
Message-ID: <20180116223207.GA32637@redhat.com> (raw)
In-Reply-To: <20180116181752.25847-4-bart.vanassche@wdc.com>
On Tue, Jan 16 2018 at 1:17pm -0500,
Bart Van Assche <bart.vanassche@wdc.com> wrote:
> The __blk_mq_register_dev(), blk_mq_unregister_dev(),
> elv_register_queue() and elv_unregister_queue() calls need to be
> protected with sysfs_lock but other code in these functions not.
> Hence protect only this code with sysfs_lock. This patch fixes a
> locking inversion issue in blk_unregister_queue() and also in an
> error path of blk_register_queue(): it is not allowed to hold
> sysfs_lock around the kobject_del(&q->kobj) call.
>
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> ---
> block/blk-sysfs.c | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 4a6a40ffd78e..e9ce45ff0ef2 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -909,11 +909,12 @@ int blk_register_queue(struct gendisk *disk)
> if (q->request_fn || (q->mq_ops && q->elevator)) {
> ret = elv_register_queue(q);
> if (ret) {
> + mutex_unlock(&q->sysfs_lock);
> kobject_uevent(&q->kobj, KOBJ_REMOVE);
> kobject_del(&q->kobj);
> blk_trace_remove_sysfs(dev);
> kobject_put(&dev->kobj);
> - goto unlock;
> + return ret;
> }
> }
> ret = 0;
> @@ -934,28 +935,22 @@ void blk_unregister_queue(struct gendisk *disk)
> if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
> return;
>
> - /*
> - * Protect against the 'queue' kobj being accessed
> - * while/after it is removed.
> - */
> - mutex_lock(&q->sysfs_lock);
> -
> spin_lock_irq(q->queue_lock);
> queue_flag_clear(QUEUE_FLAG_REGISTERED, q);
> spin_unlock_irq(q->queue_lock);
>
> wbt_exit(q);
>
> + mutex_lock(&q->sysfs_lock);
> if (q->mq_ops)
> blk_mq_unregister_dev(disk_to_dev(disk), q);
>
> if (q->request_fn || (q->mq_ops && q->elevator))
> elv_unregister_queue(q);
My concern with this change is detailed in the following portion of
the header for commit 667257e8b2988c0183ba23e2bcd6900e87961606:
2) Conversely, __elevator_change() is testing for QUEUE_FLAG_REGISTERED
in case elv_iosched_store() loses the race with blk_unregister_queue(),
it needs a way to know the 'queue' kobj isn't there.
I don't think moving mutex_lock(&q->sysfs_lock); after the clearing of
QUEUE_FLAG_REGISTERED is a step in the right direction.
Current code shows:
blk_cleanup_queue() calls blk_set_queue_dying() while holding
the sysfs_lock.
queue_attr_{show,store} both test if blk_queue_dying(q) while holding
the sysfs_lock.
BUT drivers can/do call del_gendisk() _before_ blk_cleanup_queue().
(if your proposed change above were to go in all of the block drivers
would first need to be audited for the need to call blk_cleanup_queue()
before del_gendisk() -- seems awful).
Therefore it seems to me that all queue_attr_{show,store} are racey vs
blk_unregister_queue() removing the 'queue' kobject.
And it was just that __elevator_change() was myopicly fixed to address
the race whereas a more generic solution was/is needed. But short of
that more generic fix your change will reintroduce the potential for
hitting the issue that commit e9a823fb34a8b fixed.
In that light, think it best to leave blk_unregister_queue()'s
mutex_lock() above the QUEUE_FLAG_REGISTERED clearing _and_ update
queue_attr_{show,store} to test for QUEUE_FLAG_REGISTERED while holding
sysfs_lock.
Then remove the unicorn test_bit for QUEUE_FLAG_REGISTERED from
__elevator_change().
But it could be I'm wrong for some reason.. as you know that happens ;)
Mike
next prev parent reply other threads:[~2018-01-16 22:32 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-16 18:17 [PATCH 0/3] Avoid that blk_{un,}register_queue() trigger lock inversion Bart Van Assche
2018-01-16 18:17 ` [PATCH 1/3] block: Unexport elv_register_queue() and elv_unregister_queue() Bart Van Assche
2018-01-16 18:17 ` [PATCH 2/3] block: Document scheduler change code locking requirements Bart Van Assche
2018-01-16 18:17 ` [PATCH 3/3] block: Protect less code with sysfs_lock in blk_{un,}register_queue() Bart Van Assche
2018-01-16 22:32 ` Mike Snitzer [this message]
2018-01-17 0:03 ` Bart Van Assche
2018-01-17 1:23 ` Ming Lei
2018-01-17 2:17 ` Bart Van Assche
2018-01-17 13:04 ` Ming Lei
2018-01-17 17:19 ` Bart Van Assche
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180116223207.GA32637@redhat.com \
--to=snitzer@redhat.com \
--cc=axboe@kernel.dk \
--cc=bart.vanassche@wdc.com \
--cc=hch@lst.de \
--cc=linux-block@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox