public inbox for linux-nvme@lists.infradead.org
From: liruozhu <liruozhu@huawei.com>
To: Chaitanya Kulkarni <chaitanyak@nvidia.com>
Cc: "sagi@grimberg.me" <sagi@grimberg.me>, "hch@lst.de" <hch@lst.de>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: Re: [PATCH v2] nvme: Fix regression when disconnect a recovering ctrl
Date: Tue, 28 Jun 2022 11:01:18 +0800
Message-ID: <eaf85963-c490-0a2b-ef82-e1d685e518c2@huawei.com>
In-Reply-To: <c7ecbfa3-53d5-62b8-67ac-42de7711c9dc@nvidia.com>

On 2022/6/28 7:42, Chaitanya Kulkarni wrote:

> On 6/22/22 23:45, Ruozhu Li wrote:
>> We encountered a problem that the disconnect command hangs.
>> After analyzing the log and stack, we found that the triggering
>> process is as follows:
>> CPU0                          CPU1
>>                                   nvme_rdma_error_recovery_work
>>                                     nvme_rdma_teardown_io_queues
>> nvme_do_delete_ctrl                 nvme_stop_queues
>>     nvme_remove_namespaces
>>     --clear ctrl->namespaces
>>                                       nvme_start_queues
>>                                       --no ns in ctrl->namespaces
>>       nvme_ns_remove                  return(because ctrl is deleting)
>>         blk_freeze_queue
>>           blk_mq_freeze_queue_wait
>>           --wait for ns to unquiesce to clean in-flight IO, hang forever
>>
>> This problem did not occur in older kernels because err work was flushed
>> in nvme_stop_ctrl before nvme_remove_namespaces. That behavior does not
>> seem to have been changed for functional reasons, so the commit can be
>> reverted to solve the problem.
>>
>> Revert commit 794a4cb3d2f7 ("nvme: remove the .stop_ctrl callout")
>>
> Without looking into the code, do you have any idea whether the fc and/or
> loop transports also suffer from a similar issue?
>
> -ck
I am not very familiar with the code of these transports. It seems that FC
also stops and starts the queues in its err work, so it will probably have a
similar problem.

The loop transport only has reset work, and that work is flushed, so there
should be no such problem.
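
For reference, the ordering in the quoted trace can be modeled with a small
user-space sketch. This is not kernel code: the classes and helper names are
hypothetical stand-ins for the kernel functions named in the trace, and the
freeze is reduced to a boolean check instead of a blocking wait. It only
illustrates why flushing err work before the namespaces are taken off the
controller list avoids the hang.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    quiesced: bool = False   # queue is stopped; in-flight I/O cannot complete
    inflight: int = 1        # requests still outstanding on this queue


@dataclass
class Ctrl:
    namespaces: list = field(default_factory=list)
    deleting: bool = False


def stop_queues(ctrl):
    """Stand-in for nvme_stop_queues: quiesce every listed namespace."""
    for ns in ctrl.namespaces:
        ns.quiesced = True


def start_queues(ctrl):
    """Stand-in for nvme_start_queues: only walks ctrl.namespaces, so
    namespaces already moved off the list are never unquiesced."""
    for ns in ctrl.namespaces:
        ns.quiesced = False


def detach_namespaces(ctrl):
    """First half of namespace removal: clear ctrl.namespaces."""
    ctrl.deleting = True
    detached, ctrl.namespaces = ctrl.namespaces, []
    return detached


def freeze_would_complete(ns):
    """A freeze waits for in-flight I/O, which drains only if unquiesced."""
    if not ns.quiesced:
        ns.inflight = 0
    return ns.inflight == 0


def delete_during_recovery(flush_err_work_first):
    """Return True if every namespace freeze would finish, False if it hangs."""
    ctrl = Ctrl(namespaces=[Namespace(), Namespace()])
    stop_queues(ctrl)                       # error recovery tears down I/O queues
    if flush_err_work_first:
        start_queues(ctrl)                  # err work finishes before removal
        detached = detach_namespaces(ctrl)
    else:
        detached = detach_namespaces(ctrl)  # list cleared in mid-recovery
        start_queues(ctrl)                  # finds no namespaces to unquiesce
    return all(freeze_would_complete(ns) for ns in detached)
```

In this model, delete_during_recovery(False) returns False (the freeze would
wait forever), while delete_during_recovery(True) returns True.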



Thread overview: 4+ messages
2022-06-23  6:45 [PATCH v2] nvme: Fix regression when disconnect a recovering ctrl Ruozhu Li
2022-06-27 23:42 ` Chaitanya Kulkarni
2022-06-28  3:01   ` liruozhu [this message]
2022-06-29 14:17 ` Christoph Hellwig
