From: "jianchao.wang" <jianchao.w.wang@oracle.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>, Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org,
James Smart <james.smart@broadcom.com>,
Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>,
linux-nvme@lists.infradead.org,
Laurence Oberman <loberman@redhat.com>
Subject: Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling
Date: Mon, 14 May 2018 18:05:50 +0800
Message-ID: <008cb38d-aa91-6ab7-64d9-417d6c53a1eb@oracle.com>
In-Reply-To: <20180514093850.GA807@ming.t460p>
Hi Ming
On 05/14/2018 05:38 PM, Ming Lei wrote:
>> Here is the deadlock scenario.
>>
>> nvme_eh_work // EH0
>>   -> nvme_reset_dev //hold reset_lock
>>     -> nvme_setup_io_queues
>>       -> nvme_create_io_queues
>>         -> nvme_create_queue
>>           -> set nvmeq->cq_vector
>>           -> adapter_alloc_cq
>>           -> adapter_alloc_sq
>>              irq has not been requested
>>              io timeout
>>                                     nvme_eh_work //EH1
>>                                       -> nvme_dev_disable
>>                                         -> quiesce the adminq //----> here !
>>                                         -> nvme_suspend_queue
>>                                            print out warning Trying to free already-free IRQ 133
>>                                         -> nvme_cancel_request // complete the timeout admin request
>>                                       -> require reset_lock
>>           -> adapter_delete_cq
> If the admin IO submitted in adapter_alloc_sq() is timed out,
> nvme_dev_disable() in EH1 will complete it which is set as REQ_FAILFAST_DRIVER,
> then adapter_alloc_sq() should return error, and the whole reset in EH0
> should have been terminated immediately.
Please refer to the following segment:
static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
{
	struct nvme_dev *dev = nvmeq->dev;
	int result;
	...
	nvmeq->cq_vector = dev->num_vecs == 1 ? 0 : qid;
	result = adapter_alloc_cq(dev, qid, nvmeq);
	if (result < 0)
		goto release_vector;

	result = adapter_alloc_sq(dev, qid, nvmeq); // if timeout and failed here
	if (result < 0)
		goto release_cq;

	nvme_init_queue(nvmeq, qid);
	result = queue_request_irq(nvmeq);
	if (result < 0)
		goto release_sq;

	return result;

release_sq:
	dev->online_queues--;
	adapter_delete_sq(dev, qid);
release_cq: // we will be here !
	adapter_delete_cq(dev, qid); // another cq delete admin command will be sent out.
release_vector:
	nvmeq->cq_vector = -1;
	return result;
}
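To make the unwind path easier to see, here is a minimal user-space sketch of
the same goto structure. It is only a model, not driver code: struct model,
admin_cmd() and model_create_queue() are names I made up for illustration, the
irq request and the release_sq step are left out, and a plain -1 stands in for
whatever error the failed command returns. It just counts which admin commands
get issued when the sq alloc fails.

```c
#include <assert.h>

/* Hypothetical model: the admin commands issued while creating one I/O queue. */
enum admin_op { OP_CREATE_CQ, OP_CREATE_SQ, OP_DELETE_CQ, OP_MAX };

struct model {
	int issued[OP_MAX];	/* how many times each admin command was sent */
	int fail_op;		/* admin_op that fails (e.g. times out), or -1 for none */
	int cq_vector;
};

/* Record that an admin command was sent; fail it if it is the chosen one. */
static int admin_cmd(struct model *m, enum admin_op op)
{
	m->issued[op]++;
	return (int)op == m->fail_op ? -1 : 0;
}

/* Same goto-unwind shape as nvme_create_queue() above, minus the irq step. */
static int model_create_queue(struct model *m, int qid)
{
	int result;

	assert(qid > 0);	/* qid 0 is the admin queue, not an I/O queue */
	m->cq_vector = qid;
	result = admin_cmd(m, OP_CREATE_CQ);
	if (result < 0)
		goto release_vector;

	result = admin_cmd(m, OP_CREATE_SQ);	/* the command that timed out */
	if (result < 0)
		goto release_cq;

	return result;

release_cq:
	admin_cmd(m, OP_DELETE_CQ);	/* one more admin command on the error path */
release_vector:
	m->cq_vector = -1;
	return result;
}
```

With fail_op set to OP_CREATE_SQ, the function returns an error as expected,
but only after it has issued OP_DELETE_CQ. So in this model the failure of the
sq alloc does not end the reset at once; it first sends another admin command.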
>
> I guess the issue should be that nvme_create_io_queues() ignores the failure.
>
> Could you dump the stack trace of EH0 reset task? So that we may see
> where EH0 reset kthread hangs.
root@will-ThinkCentre-M910s:/home/will/Desktop# cat /proc/2273/stack
[<0>] blk_execute_rq+0xf7/0x150
[<0>] __nvme_submit_sync_cmd+0x94/0x110
[<0>] nvme_submit_sync_cmd+0x1b/0x20
[<0>] adapter_delete_queue+0xad/0xf0
[<0>] nvme_reset_dev+0x1b67/0x2450
[<0>] nvme_eh_work+0x19c/0x4b0
[<0>] process_one_work+0x3ca/0xaa0
[<0>] worker_thread+0x89/0x6c0
[<0>] kthread+0x18d/0x1e0
[<0>] ret_from_fork+0x24/0x30
[<0>] 0xffffffffffffffff
root@will-ThinkCentre-M910s:/home/will/Desktop# cat /proc/2275/stack
[<0>] nvme_eh_work+0x11a/0x4b0
[<0>] process_one_work+0x3ca/0xaa0
[<0>] worker_thread+0x89/0x6c0
[<0>] kthread+0x18d/0x1e0
[<0>] ret_from_fork+0x24/0x30
[<0>] 0xffffffffffffffff
>
>>             -> adapter_delete_queue // submit to the adminq which has been quiesced.
>>               -> nvme_submit_sync_cmd
>>                 -> blk_execute_rq
>>                   -> wait_for_completion_io_timeout
>>                      hang_check is true, so there is no hung task warning for this context
>>
>> EH0 submits the cq delete admin command, but it will never be completed or timed out, because the admin request queue
>> has been quiesced. So the reset_lock cannot be released, and EH1 cannot get the reset_lock and move things forward.
> The nvme_dev_disable() in the outer EH (EH1 in the above log) will complete all
> admin commands, which won't be retried because they are set as
> REQ_FAILFAST_DRIVER, so nvme_cancel_request() will complete them in
> nvme_dev_disable().
This cq delete admin command is sent out after nvme_dev_disable() in EH1 has completed and failed the
previously timed-out sq alloc admin command. Please refer to the code segment above.
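The ordering can also be sketched as a tiny user-space model. This is a sketch
under assumptions, not the block layer: struct model_queue, struct model_rq and
the model_*() helpers are hypothetical names, time is a plain integer, and the
model assumes that the timeout is armed only when a request is actually handed
to the driver and that the cancel pass only touches such in-flight requests.

```c
#include <assert.h>
#include <stdbool.h>

enum rq_state { RQ_QUEUED, RQ_IN_FLIGHT, RQ_DONE };

struct model_queue {
	bool quiesced;		/* a quiesced queue dispatches nothing */
};

struct model_rq {
	enum rq_state state;
	bool timer_armed;	/* assumption: armed only when the request is started */
	int deadline;
};

/* Queue a request and dispatch it unless the queue is quiesced. */
static void model_submit(struct model_queue *q, struct model_rq *rq,
			 int now, int timeout)
{
	assert(timeout > 0);
	rq->state = RQ_QUEUED;
	rq->timer_armed = false;
	rq->deadline = 0;
	if (q->quiesced)
		return;		/* stays queued, never reaches the driver */
	rq->state = RQ_IN_FLIGHT;
	rq->timer_armed = true;
	rq->deadline = now + timeout;
}

/* Timeout scan: only an in-flight request with an armed timer can expire. */
static bool model_timed_out(const struct model_rq *rq, int now)
{
	return rq->state == RQ_IN_FLIGHT && rq->timer_armed &&
	       now >= rq->deadline;
}

/* Cancel pass of the disable step: completes a request that is in flight. */
static bool model_cancel(struct model_rq *rq)
{
	if (rq->state != RQ_IN_FLIGHT)
		return false;
	rq->state = RQ_DONE;
	return true;
}
```

In this model the sq alloc request is submitted while the queue is live, times
out, and is completed by the cancel pass after the queue has been quiesced.
The cq delete request is submitted only after that, so it stays in RQ_QUEUED:
model_timed_out() never returns true for it and a cancel pass that has already
run cannot complete it. The submitter would wait forever while holding the lock.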
Thanks
jianchao