From: Ming Lei <ming.lei@redhat.com>
To: Mohamed Khalfella <mkhalfella@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>, Keith Busch <kbusch@kernel.org>,
Sagi Grimberg <sagi@grimberg.me>,
Chaitanya Kulkarni <kch@nvidia.com>,
Casey Chen <cachen@purestorage.com>,
Vikas Manocha <vmanocha@purestorage.com>,
Yuanyuan Zhong <yzhong@purestorage.com>,
Hannes Reinecke <hare@suse.de>,
linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/1] nvme: Convert tag_list mutex to rwsemaphore to avoid deadlock
Date: Tue, 18 Nov 2025 12:16:10 +0800 [thread overview]
Message-ID: <aRvzCs_FXa_liTWv@fedora> (raw)
In-Reply-To: <20251118034405.GB2376676-mkhalfella@purestorage.com>
On Mon, Nov 17, 2025 at 07:44:05PM -0800, Mohamed Khalfella wrote:
> On Tue 2025-11-18 10:30:52 +0800, Ming Lei wrote:
> > On Mon, Nov 17, 2025 at 06:15:04PM -0800, Mohamed Khalfella wrote:
> > > On Tue 2025-11-18 10:00:19 +0800, Ming Lei wrote:
> > > > On Mon, Nov 17, 2025 at 12:23:53PM -0800, Mohamed Khalfella wrote:
> > > > > static void blk_mq_add_queue_tag_set(struct blk_mq_tag_set *set,
> > > > > struct request_queue *q)
> > > > > {
> > > > > - mutex_lock(&set->tag_list_lock);
> > > > > + struct request_queue *firstq;
> > > > > + unsigned int memflags;
> > > > >
> > > > > - /*
> > > > > - * Check to see if we're transitioning to shared (from 1 to 2 queues).
> > > > > - */
> > > > > - if (!list_empty(&set->tag_list) &&
> > > > > - !(set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
> > > > > - set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > > > > - /* update existing queue */
> > > > > - blk_mq_update_tag_set_shared(set, true);
> > > > > - }
> > > > > - if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > > > > - queue_set_hctx_shared(q, true);
> > > > > - list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > > + down_write(&set->tag_list_rwsem);
> > > > > + if (!list_is_singular(&set->tag_list)) {
> > > > > + if (set->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
> > > > > + queue_set_hctx_shared(q, true);
> > > > > + list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > > + up_write(&set->tag_list_rwsem);
> > > > > + return;
> > > > > + }
> > > > >
> > > > > - mutex_unlock(&set->tag_list_lock);
> > > > > + /* Transitioning firstq and q to shared. */
> > > > > + set->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
> > > > > + list_add_tail(&q->tag_set_list, &set->tag_list);
> > > > > + downgrade_write(&set->tag_list_rwsem);
> > > > > + queue_set_hctx_shared(q, true);
> > > >
> > > > queue_set_hctx_shared(q, true) should be moved into write critical area
> > > > because this queue has been added to the list.
> > > >
> > >
> > > I failed to see why that is the case. What can go wrong by running
> > > queue_set_hctx_shared(q, true) after downgrade_write()?
> > >
> > > After the semaphore is downgraded we promise not to change the list
> > > set->tag_list because now we have read-only access. Marking the "q" as
> > > shared should be fine because it is new and we know there will be no
> > > users of the queue yet (that is why we skipped freezing it).
> >
> > I think it is read/write lock's use practice. The protected data shouldn't be
> > written any more when you downgrade to read lock.
> >
> > In this case, it may not make a difference, because it is one new queue and
> > the other readers don't use the `shared` flag, but still better to do
> > correct things from beginning and make code less fragile.
> >
>
> set->tag_list_rwsem protects set->tag_list. It does not protect
> hctx->flags. hctx->flags is protected by the context. In the case of "q"
> it is new and we are not expecting request allocation. In case of
> "firstq" the queue is frozen which makes it safe to update hctx->flags.
> I prefer to keep the code as it is unless there is a reason to change
> it.
Fair enough, given it is done for `firstrq`.
Thanks,
Ming
next prev parent reply other threads:[~2025-11-18 4:16 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-17 20:23 [PATCH v2 0/1] nvme: Convert tag_list mutex to rwsemaphore to avoid deadlock Mohamed Khalfella
2025-11-17 20:23 ` [PATCH v2 1/1] " Mohamed Khalfella
2025-11-18 1:34 ` Hillf Danton
2025-11-18 2:07 ` Mohamed Khalfella
2025-11-18 2:24 ` Waiman Long
2025-11-18 3:08 ` Ming Lei
2025-11-18 3:42 ` Waiman Long
2025-11-18 4:03 ` Ming Lei
[not found] ` <702d904c-2d9a-42b4-95b3-0fa43d91e673@redhat.com>
2025-11-18 4:50 ` Ming Lei
2025-11-18 17:59 ` Mohamed Khalfella
2025-11-18 5:02 ` Mohamed Khalfella
2025-11-18 2:00 ` Ming Lei
2025-11-18 2:15 ` Mohamed Khalfella
2025-11-18 2:30 ` Ming Lei
2025-11-18 3:44 ` Mohamed Khalfella
2025-11-18 4:16 ` Ming Lei [this message]
2025-11-24 4:00 ` Ming Lei
2025-11-24 17:42 ` Mohamed Khalfella
2025-12-01 1:36 ` Hillf Danton
2025-12-01 2:50 ` Jens Axboe
2025-11-30 22:56 ` Sagi Grimberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aRvzCs_FXa_liTWv@fedora \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=cachen@purestorage.com \
--cc=hare@suse.de \
--cc=kbusch@kernel.org \
--cc=kch@nvidia.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=mkhalfella@purestorage.com \
--cc=sagi@grimberg.me \
--cc=vmanocha@purestorage.com \
--cc=yzhong@purestorage.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.