From: jianchao.w.wang@oracle.com (jianchao.wang)
Subject: [PATCH V3 7/8] nvme: pci: recover controller reliably
Date: Thu, 3 May 2018 17:14:30 +0800
Message-ID: <32819b0a-acc8-fa76-5e58-8e75e2bc081b@oracle.com>
In-Reply-To: <20180503031716.31446-8-ming.lei@redhat.com>
Hi Ming,
On 05/03/2018 11:17 AM, Ming Lei wrote:
> static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
> @@ -1199,7 +1204,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
> if (nvme_should_reset(dev, csts)) {
> nvme_warn_reset(dev, csts);
> nvme_dev_disable(dev, false, true);
> - nvme_reset_ctrl(&dev->ctrl);
> + nvme_eh_reset(dev);
> return BLK_EH_HANDLED;
> }
>
> @@ -1242,7 +1247,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
> "I/O %d QID %d timeout, reset controller\n",
> req->tag, nvmeq->qid);
> nvme_dev_disable(dev, false, true);
> - nvme_reset_ctrl(&dev->ctrl);
> + nvme_eh_reset(dev);
Without the 8th patch, invoking nvme_eh_reset in nvme_timeout is dangerous.
nvme_pre_reset_dev sends a lot of admin I/O while initializing the controller.
If these admin I/Os time out, nvme_timeout cannot handle them, because the timeout work
is sleeping while it waits for those same admin I/Os.
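
To make the hang above concrete, here is a minimal user-space sketch in C. It is not kernel code: struct timeout_ctx, admin_timeout_can_be_handled() and eh_reset_from_timeout() are hypothetical names used only for illustration. The single flag assumes, as described above, that the timeout work is one context that cannot handle a new timeout while it is sleeping.

```c
#include <stdbool.h>

/* Hypothetical model of the single timeout-work context (not a kernel type). */
struct timeout_ctx {
	bool sleeping;	/* true while the timeout work waits for admin I/O */
};

/* A timed-out admin command can only be handled if the context is free. */
static bool admin_timeout_can_be_handled(const struct timeout_ctx *ctx)
{
	return !ctx->sleeping;
}

/*
 * Models a reset that runs synchronously from the timeout handler.
 * Returns true if the reset would finish, false if it would hang.
 */
static bool eh_reset_from_timeout(struct timeout_ctx *ctx, bool admin_io_times_out)
{
	bool done = true;

	ctx->sleeping = true;	/* the handler now waits for admin I/O */
	if (admin_io_times_out && !admin_timeout_can_be_handled(ctx))
		done = false;	/* nobody is left to time out the admin I/O */
	ctx->sleeping = false;	/* reset the model so it can be reused */
	return done;
}
```

With admin_io_times_out set, eh_reset_from_timeout() returns false, i.e. the model reports the hang described above. Without an admin timeout it returns true.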
In addition, even if we take nvme_wait_freeze out of nvme_eh_reset and move it into another context,
the ctrl state is still CONNECTING, so nvme_eh_reset cannot move forward.
Actually, I reported this issue to Keith before. I hit an I/O hang when the controller died in
nvme_reset_work -> nvme_wait_freeze. As you know, nvme_reset_work cannot be scheduled again because it is still waiting there.
Here is Keith's commit for this:
http://lists.infradead.org/pipermail/linux-nvme/2018-February/015603.html
Thanks
Jianchao
Thread overview: 19+ messages
2018-05-03 3:17 [PATCH V3 0/8] nvme: pci: fix & improve timeout handling Ming Lei
2018-05-03 3:17 ` [PATCH V3 1/8] block: introduce blk_quiesce_timeout() and blk_unquiesce_timeout() Ming Lei
2018-05-03 3:17 ` [PATCH V3 2/8] nvme: pci: cover timeout for admin commands running in EH Ming Lei
2018-05-03 3:17 ` [PATCH V3 3/8] nvme: pci: only wait freezing if queue is frozen Ming Lei
2018-05-03 3:17 ` [PATCH V3 4/8] nvme: pci: freeze queue in nvme_dev_disable() in case of error recovery Ming Lei
2018-05-03 3:17 ` [PATCH V3 5/8] nvme: fix race between freeze queues and unfreeze queues Ming Lei
2018-05-03 3:17 ` [PATCH V3 6/8] nvme: pci: split controller resetting into two parts Ming Lei
2018-05-03 3:17 ` [PATCH V3 7/8] nvme: pci: recover controller reliably Ming Lei
2018-05-03 9:14 ` jianchao.wang [this message]
2018-05-03 10:08 ` Ming Lei
2018-05-03 15:46 ` jianchao.wang
2018-05-04 4:24 ` Ming Lei
2018-05-04 6:10 ` jianchao.wang
2018-05-04 6:21 ` jianchao.wang
2018-05-04 8:02 ` Ming Lei
2018-05-04 8:28 ` jianchao.wang
2018-05-04 9:16 ` Ming Lei
2018-05-05 0:16 ` Ming Lei
2018-05-03 3:17 ` [PATCH V3 8/8] nvme: pci: simplify timeout handling Ming Lei