From: Chao Leng <lengchao@huawei.com>
To: Sagi Grimberg <sagi@grimberg.me>,
<linux-nvme@lists.infradead.org>, <linux-block@vger.kernel.org>
Cc: <hch@lst.de>, <kbusch@kernel.org>, <axboe@kernel.dk>,
<ming.lei@redhat.com>, <paulmck@kernel.org>
Subject: Re: [PATCH v3 2/2] nvme: use blk_mq_[un]quiesce_tagset
Date: Thu, 20 Oct 2022 17:47:46 +0800 [thread overview]
Message-ID: <ffba2ecb-0c91-2960-9d83-08d7a3248cb1@huawei.com> (raw)
In-Reply-To: <13088882-2d1b-ca71-8420-84bb47760cff@grimberg.me>
On 2022/10/20 14:11, Sagi Grimberg wrote:
>
>
> On 10/20/22 06:53, Chao Leng wrote:
>> All controller namespaces share the same tagset, so we can use this
>> interface which does the optimal operation for parallel quiesce based on
>> the tagset type(e.g. blocking tagsets and non-blocking tagsets).
>>
>> nvme connect_q should not be quiesced when quiesce tagset, so set the
>> QUEUE_FLAG_SKIP_TAGSET_QUIESCE to skip it when init connect_q.
>>
>> Currntely we use NVME_NS_STOPPED to ensure pairing quiescing and
>> unquiescing. If use blk_mq_[un]quiesce_tagset, NVME_NS_STOPPED will be
>> invalided, so introduce NVME_CTRL_STOPPED to replace NVME_NS_STOPPED.
>> In addition, we never really quiesce a single namespace. It is a better
>> choice to move the flag from ns to ctrl.
>>
>> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
>> Signed-off-by: Chao Leng <lengchao@huawei.com>
>> ---
>> drivers/nvme/host/core.c | 57 +++++++++++++++++++-----------------------------
>> drivers/nvme/host/nvme.h | 3 ++-
>> 2 files changed, 25 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index 059737c1a2c1..c7727d1f228e 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -4890,6 +4890,7 @@ int nvme_alloc_io_tag_set(struct nvme_ctrl *ctrl, struct blk_mq_tag_set *set,
>> ret = PTR_ERR(ctrl->connect_q);
>> goto out_free_tag_set;
>> }
>> + blk_queue_flag_set(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, ctrl->connect_q);
>> }
>> ctrl->tagset = set;
>> @@ -5013,6 +5014,7 @@ int nvme_init_ctrl(struct nvme_ctrl *ctrl, struct device *dev,
>> clear_bit(NVME_CTRL_FAILFAST_EXPIRED, &ctrl->flags);
>> spin_lock_init(&ctrl->lock);
>> mutex_init(&ctrl->scan_lock);
>> + mutex_init(&ctrl->queue_state_lock);
>
> Why is this lock needed?
It is used to secure the process which need that the queue must be quiesced.
The scenario:(without the lock)
Thread A: call nvme_stop_queues and set the NVME_CTRL_STOPPED and then quiesce the tagset
and wait the grace period.
Thread B: call nvme_stop_queues, because the NVME_CTRL_STOPPED is already setted,
continue to do something which need that the queue must be quiesced,
because the grace period of the queue is not ended, may cause abnormal.
Thread A: the grace period end, and continue.
So add a lock to ensure that all queues are quiesced after set the NVME_CTRL_STOPPED.
The old code was implemented by forcing a wait for the grace period. Show the code:
if (!test_and_set_bit(NVME_NS_STOPPED, &ns->flags))
blk_mq_quiesce_queue(ns->queue);
else
blk_mq_wait_quiesce_done(ns->queue);
The old code was not absolutely safe, such as this scenario:
Thread A: test_and_set_bit, and interrupt by hardware irq, lost chance to run.
Thread B: test_and_set_bit, and wait the grace period, and then continue
to do something which need that the queue must be quiesced,
because the queue is not quiesced, may cause abnormal.
Thread A: get the chance to run, blk_mq_quiesce_queue, and then continue.
>
> .
next prev parent reply other threads:[~2022-10-20 9:48 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-20 3:53 [PATCH v2 0/2] improve nvme quiesce time for large amount of namespaces Chao Leng
2022-10-20 3:53 ` [PATCH v3 1/2] blk-mq: add tagset quiesce interface Chao Leng
2022-10-20 16:11 ` Paul E. McKenney
2022-10-20 3:53 ` [PATCH v3 2/2] nvme: use blk_mq_[un]quiesce_tagset Chao Leng
2022-10-20 6:11 ` Sagi Grimberg
2022-10-20 9:47 ` Chao Leng [this message]
2022-10-20 6:34 ` [PATCH v2 0/2] improve nvme quiesce time for large amount of namespaces Christoph Hellwig
2022-10-20 9:36 ` Chao Leng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ffba2ecb-0c91-2960-9d83-08d7a3248cb1@huawei.com \
--to=lengchao@huawei.com \
--cc=axboe@kernel.dk \
--cc=hch@lst.de \
--cc=kbusch@kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=ming.lei@redhat.com \
--cc=paulmck@kernel.org \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox