From: Ming Lei <ming.lei@redhat.com>
To: Nilay Shroff <nilay@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
syzbot+4c7e0f9b94ad65811efb@syzkaller.appspotmail.com
Subject: Re: [PATCH] block: don't grab elevator lock during queue initialization
Date: Tue, 8 Apr 2025 15:38:08 +0800 [thread overview]
Message-ID: <Z_TSYOzPI3GwVms7@fedora> (raw)
In-Reply-To: <ea09ea46-4772-4947-a9ad-195e83f1490d@linux.ibm.com>
On Mon, Apr 07, 2025 at 01:59:48PM +0530, Nilay Shroff wrote:
>
>
> On 4/7/25 8:39 AM, Ming Lei wrote:
> > On Sat, Apr 05, 2025 at 07:44:19PM +0530, Nilay Shroff wrote:
> >>
> >>
> >> On 4/4/25 2:40 PM, Christoph Hellwig wrote:
> >>> On Thu, Apr 03, 2025 at 06:54:02PM +0800, Ming Lei wrote:
> >>>> Fixes the following lockdep warning:
> >>>
> >>> Please spell the actual dependency out here, links are not permanent
> >>> and also not readable for any offline reading of the commit logs.
> >>>
> >>>> +static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
> >>>> + struct request_queue *q, bool lock)
> >>>> +{
> >>>> + if (lock) {
> >>>
> >>> bool lock(ed) arguments are an anti-pattern, and regularly get Linus
> >>> screaming at you (in this case even for the right reason :))
> >>>
> >>>> + /* protect against switching io scheduler */
> >>>> + mutex_lock(&q->elevator_lock);
> >>>> + __blk_mq_realloc_hw_ctxs(set, q);
> >>>> + mutex_unlock(&q->elevator_lock);
> >>>> + } else {
> >>>> + __blk_mq_realloc_hw_ctxs(set, q);
> >>>> + }
> >>>
> >>> I think the problem here is again that because of all the other
> >>> dependencies elevator_lock really needs to be per-set instead of
> >>> per-queue which will allows us to have much saner locking hierarchies.
> >>>
> >> I believe you meant here q->tag_set->elevator_lock?
> >
> > I don't know what locks you are planning to invent.
> >
> > For set->tag_list_lock, it has been very fragile:
> >
> > blk_mq_update_nr_hw_queues
> > set->tag_list_lock
> > freeze_queue
> >
> > If IO failure happens when waiting in above freeze_queue(), the nvme error
> > handling can't provide forward progress any more, because the error
> > handling code path requires set->tag_list_lock.
>
> I think you're referring here nvme_quiesce_io_queues and nvme_unquiesce_io_queues
Yes.
> which is called in nvme error handling path. If yes then I believe this function
> could be easily modified so that it doesn't require ->tag_list_lock.
Not sure it is easily, ->tag_list_lock is exactly for protecting the list of "set->tag_list".
And the same list is iterated in blk_mq_update_nr_hw_queues() too.
>
> >
> > So all queues should be frozen first before calling blk_mq_update_nr_hw_queues,
> > fortunately that is what nvme is doing.
> >
> >
> >> If yes then it means that we should be able to grab ->elevator_lock
> >> before freezing the queue in __blk_mq_update_nr_hw_queues and so locking
> >> order should be in each code path,
> >>
> >> __blk_mq_update_nr_hw_queues
> >> ->elevator_lock
> >> ->freeze_lock
> >
> > Now tagset->elevator_lock depends on set->tag_list_lock, and this way
> > just make things worse. Why can't we disable elevator switch during
> > updating nr_hw_queues?
> >
> I couldn't quite understand this. As we already first disable the elevator
> before updating sw to hw queue mapping in __blk_mq_update_nr_hw_queues().
> Once mapping is successful we switch back the elevator.
Yes, but user still may switch elevator from none to others during the
period, right?
thanks,
Ming
next prev parent reply other threads:[~2025-04-08 7:38 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-03 10:54 [PATCH] block: don't grab elevator lock during queue initialization Ming Lei
2025-04-03 13:19 ` Nilay Shroff
2025-04-03 14:24 ` Ming Lei
2025-04-05 14:00 ` Nilay Shroff
2025-04-03 14:32 ` Jens Axboe
2025-04-04 9:10 ` Christoph Hellwig
2025-04-04 12:09 ` Ming Lei
2025-04-07 6:49 ` Christoph Hellwig
2025-04-05 14:14 ` Nilay Shroff
2025-04-07 3:09 ` Ming Lei
2025-04-07 8:29 ` Nilay Shroff
2025-04-08 7:38 ` Ming Lei [this message]
2025-04-08 13:25 ` Nilay Shroff
2025-04-08 13:50 ` Ming Lei
2025-04-09 9:12 ` Nilay Shroff
2025-04-09 11:46 ` Ming Lei
2025-04-09 13:46 ` Nilay Shroff
2025-04-09 14:08 ` Ming Lei
2025-04-09 19:45 ` Nilay Shroff
2025-04-10 2:10 ` Ming Lei
2025-04-10 13:36 ` Nilay Shroff
2025-04-10 14:23 ` Ming Lei
2025-04-10 14:48 ` Nilay Shroff
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z_TSYOzPI3GwVms7@fedora \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=hch@lst.de \
--cc=linux-block@vger.kernel.org \
--cc=nilay@linux.ibm.com \
--cc=syzbot+4c7e0f9b94ad65811efb@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).