From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, Greg KH <gregkh@linuxfoundation.org>,
	Mike Snitzer <snitzer@redhat.com>
Subject: Re: [PATCH] block: don't acquire .sysfs_lock before removing mq & iosched kobjects
Date: Mon, 19 Aug 2019 16:15:38 +0800
Message-ID: <20190819081536.GA9852@ming.t460p>
In-Reply-To: <429c8ae2-894a-1eb2-83d3-95703d1573cf@acm.org>

On Fri, Aug 16, 2019 at 08:14:13AM -0700, Bart Van Assche wrote:
> On 8/16/19 6:55 AM, Ming Lei wrote:
> > The kernfs built-in lock of 'kn->count' is held in the sysfs
> > .show/.store path. Meanwhile, inside the block layer's .show/.store
> > callbacks, q->sysfs_lock is required.
> > 
> > However, when the mq & iosched kobjects are removed via
> > blk_mq_unregister_dev() & elv_unregister_queue(), q->sysfs_lock is held
> > too. This causes an AB-BA deadlock because the kernfs built-in lock of
> > 'kn->count' is required inside kobject_del() too; see the lockdep
> > warning[1].
> > 
> > On the other hand, it isn't necessary to acquire q->sysfs_lock for
> > both blk_mq_unregister_dev() & elv_unregister_queue() because
> > clearing the REGISTERED flag prevents stores to 'queue/scheduler'
> > from happening. Also, sysfs write (store) is exclusive, so it is not
> > necessary to hold the lock for elv_unregister_queue() when it is
> > called in the elevator switching path.
> > 
> > Fix the issue by not holding q->sysfs_lock for blk_mq_unregister_dev() &
> > elv_unregister_queue().
> 
> Have you considered splitting sysfs_lock into multiple mutexes? Today it is

So far, I haven't considered it. At least for now, I just don't see the
need to hold sysfs_lock for both blk_mq_unregister_dev() &
elv_unregister_queue(). Dropping it there fixes the deadlock, which can
be triggered quite easily, and the fix should go to -stable.
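
As a rough illustration of the idea (a userspace sketch with pthreads,
not kernel code): the flag is cleared under the lock, the lock is
dropped, and only then are the kobjects removed, so removal never runs
with sysfs_lock held. All toy_* names are hypothetical, and returning
-ENOENT from the store path is an assumption made for the sketch; the
real .store callback may report this case differently.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for a request queue; not the kernel struct. */
struct toy_queue {
	pthread_mutex_t sysfs_lock;	/* plays the role of q->sysfs_lock */
	bool registered;		/* plays the role of the REGISTERED flag */
	bool kobjects_present;		/* stands in for the mq & iosched kobjects */
	char scheduler[16];
};

static void toy_queue_init(struct toy_queue *q)
{
	pthread_mutex_init(&q->sysfs_lock, NULL);
	q->registered = true;
	q->kobjects_present = true;
	strcpy(q->scheduler, "none");
}

/* Models a .store callback: takes sysfs_lock and refuses once unregistered. */
static int toy_store_scheduler(struct toy_queue *q, const char *name)
{
	int ret = 0;

	assert(name != NULL);

	pthread_mutex_lock(&q->sysfs_lock);
	if (!q->registered) {
		ret = -ENOENT;	/* assumption: error code chosen for the sketch */
	} else {
		strncpy(q->scheduler, name, sizeof(q->scheduler) - 1);
		q->scheduler[sizeof(q->scheduler) - 1] = '\0';
	}
	pthread_mutex_unlock(&q->sysfs_lock);
	return ret;
}

/* Models unregistration: clear the flag under the lock, drop it, then remove. */
static void toy_unregister(struct toy_queue *q)
{
	pthread_mutex_lock(&q->sysfs_lock);
	q->registered = false;
	pthread_mutex_unlock(&q->sysfs_lock);

	/* sysfs_lock is not held here, so removal cannot nest under it. */
	q->kobjects_present = false;
}
```

Once toy_unregister() has returned, any later toy_store_scheduler()
call sees the cleared flag and leaves the scheduler name untouched.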

Yeah, I agree that sysfs_lock has been used too widely.

> very hard to verify the correctness of block layer code that uses sysfs_lock
> because it has not been documented anywhere what that mutex protects. I
> think that mutex should be split into at least two mutexes: one that
> protects switching I/O schedulers and another one that protects hctx->tags
> and hctx->sched_tags.

sysfs_lock is required in every .show & .store callback, and switching the
I/O scheduler is done in .store(), so hctx->sched_tags is protected by
sysfs_lock too.

hctx->tags is tagset-wide (host-wide) and is protected by set->tag_list_lock.


Thanks,
Ming
