From: Ming Lei <ming.lei@redhat.com>
To: Mohamed Khalfella <mkhalfella@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>, Keith Busch <kbusch@kernel.org>,
Sagi Grimberg <sagi@grimberg.me>,
Chaitanya Kulkarni <kch@nvidia.com>,
Casey Chen <cachen@purestorage.com>,
Vikas Manocha <vmanocha@purestorage.com>,
Yuanyuan Zhong <yzhong@purestorage.com>,
Hannes Reinecke <hare@suse.de>,
linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/1] nvme: Convert tag_list mutex to rwsemaphore to avoid deadlock
Date: Tue, 18 Nov 2025 12:16:10 +0800 [thread overview]
Message-ID: <aRvzCs_FXa_liTWv@fedora> (raw)
In-Reply-To: <20251118034405.GB2376676-mkhalfella@purestorage.com>
On Mon, Nov 17, 2025 at 07:44:05PM -0800, Mohamed Khalfella wrote:
> On Tue 2025-11-18 10:30:52 +0800, Ming Lei wrote:
> > On Mon, Nov 17, 2025 at 06:15:04PM -0800, Mohamed Khalfella wrote:
> > > On Tue 2025-11-18 10:00:19 +0800, Ming Lei wrote:
> > > > On Mon, Nov 17, 2025 at 12:23:53PM -0800, Mohamed Khalfella wrote:
> > > > > static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set,
> > > > > struct request_queue *q)
> > > > > {
> > > > > - mutex_lock(&set->tag_list_lock);
> > > > > + struct request_queue *firstq;
> > > > > + unsigned int memflags;
> > > > >
> > > > > - /*
> > > > > - * Check to see if we're transitioning to shared (from 1 to 2 queues).
> > > > > - */
> > > > > - if (!list_empty(&set->tag_list) &&
> > > > > - !(set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
> > > > > - set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > > > > - /* update existing queue */
> > > > > - blk_mq_update_tag_set_shared(set, true);
> > > > > - }
> > > > > - if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > > > > - queue_set_hctx_shared(q, true);
> > > > > - list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > > + down_write(&set->tag_list_rwsem);
> > > > > + if (!list_is_singular(&set->tag_list)) {
> > > > > + if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > > > > + queue_set_hctx_shared(q, true);
> > > > > + list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > > + up_write(&set->tag_list_rwsem);
> > > > > + return;
> > > > > + }
> > > > >
> > > > > - mutex_unlock(&set->tag_list_lock);
> > > > > + /* Transitioning firstq and q to shared. */
> > > > > + set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > > > > + list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > > + downgrade_write(&set->tag_list_rwsem);
> > > > > + queue_set_hctx_shared(q, true);
> > > >
> > > > queue_set_hctx_shared(q, true) should be moved into write critical area
> > > > because this queue has been added to the list.
> > > >
> > >
> > > I failed to see why that is the case. What can go wrong by running
> > > queue_set_hctx_shared(q, true) after downgrade_write()?
> > >
> > > After the semaphore is downgraded we promise not to change the list
> > > set->tag_list because now we have read-only access. Marking the "q" as
> > > shared should be fine because it is new and we know there will be no
> > > users of the queue yet (that is why we skipped freezing it).
> >
> > I think it is read/write lock's use practice. The protected data shouldn't be
> > written any more when you downgrade to read lock.
> >
> > In this case, it may not make a difference, because it is one new queue and
> > the other readers don't use the `shared` flag, but still better to do
> > correct things from beginning and make code less fragile.
> >
>
> set->tag_list_rwsem protects set->tag_list. It does not protect
> hctx->flags. hctx->flags is protected by the context. In the case of "q"
> it is new and we are not expecting request allocation. In case of
> "firstq" the queue is frozen which makes it safe to update hctx->flags.
> I prefer to keep the code as it is unless there is a reason to change
> it.
Fair enough, given it is done for `firstrq`.
Thanks,
Ming
next prev parent reply other threads:[~2025-11-18 4:16 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-17 20:23 [PATCH v2 0/1] nvme: Convert tag_list mutex to rwsemaphore to avoid deadlock Mohamed Khalfella
2025-11-17 20:23 ` [PATCH v2 1/1] " Mohamed Khalfella
2025-11-18 1:34 ` Hillf Danton
2025-11-18 2:07 ` Mohamed Khalfella
2025-11-18 2:24 ` Waiman Long
2025-11-18 3:08 ` Ming Lei
2025-11-18 3:42 ` Waiman Long
2025-11-18 4:03 ` Ming Lei
[not found] ` <702d904c-2d9a-42b4-95b3-0fa43d91e673@redhat.com>
2025-11-18 4:50 ` Ming Lei
2025-11-18 17:59 ` Mohamed Khalfella
2025-11-18 5:02 ` Mohamed Khalfella
2025-11-18 2:00 ` Ming Lei
2025-11-18 2:15 ` Mohamed Khalfella
2025-11-18 2:30 ` Ming Lei
2025-11-18 3:44 ` Mohamed Khalfella
2025-11-18 4:16 ` Ming Lei [this message]
2025-11-24 4:00 ` Ming Lei
2025-11-24 17:42 ` Mohamed Khalfella
2025-12-01 1:36 ` Hillf Danton
2025-12-01 2:50 ` Jens Axboe
2025-11-30 22:56 ` Sagi Grimberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aRvzCs_FXa_liTWv@fedora \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=cachen@purestorage.com \
--cc=hare@suse.de \
--cc=kbusch@kernel.org \
--cc=kch@nvidia.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=mkhalfella@purestorage.com \
--cc=sagi@grimberg.me \
--cc=vmanocha@purestorage.com \
--cc=yzhong@purestorage.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).