Linux-NVME Archive on lore.kernel.org
From: jianchao.w.wang@oracle.com (jianchao.wang)
Subject: [PATCH V3 7/8] nvme: pci: recover controller reliably
Date: Thu, 3 May 2018 17:14:30 +0800
Message-ID: <32819b0a-acc8-fa76-5e58-8e75e2bc081b@oracle.com>
In-Reply-To: <20180503031716.31446-8-ming.lei@redhat.com>

Hi Ming,

On 05/03/2018 11:17 AM, Ming Lei wrote:
>  static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
> @@ -1199,7 +1204,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  	if (nvme_should_reset(dev, csts)) {
>  		nvme_warn_reset(dev, csts);
>  		nvme_dev_disable(dev, false, true);
> -		nvme_reset_ctrl(&dev->ctrl);
> +		nvme_eh_reset(dev);
>  		return BLK_EH_HANDLED;
>  	}
>  
> @@ -1242,7 +1247,7 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
>  			 "I/O %d QID %d timeout, reset controller\n",
>  			 req->tag, nvmeq->qid);
>  		nvme_dev_disable(dev, false, true);
> -		nvme_reset_ctrl(&dev->ctrl);
> +		nvme_eh_reset(dev);

Without the 8th patch, invoking nvme_eh_reset in nvme_timeout is dangerous.
nvme_pre_reset_dev sends a lot of admin I/O while initializing the controller.
If these admin I/Os time out, nvme_timeout cannot handle them, because the timeout work
is sleeping while it waits for those same admin I/Os.
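
To make the dependency concrete, here is a minimal user-space sketch in C. It is
a toy model, not kernel code: every name in it (timeout_worker, admin_cmd,
handle_timeout, wait_for_admin, reset_controller) is hypothetical, and the bounded
polling loop is an assumption that stands in for a real sleep so the example terminates.

```c
#include <stdbool.h>

/* Hypothetical model of the single context that runs timeout handling. */
struct timeout_worker {
	bool busy;	/* true while the worker is blocked inside a handler */
};

/* Hypothetical model of one admin command sent during controller init. */
struct admin_cmd {
	bool timed_out;	/* the device never answered */
	bool completed;	/* set by the device or by timeout handling */
};

/*
 * In this model, timeout handling is the only thing that can complete a
 * command the device never answers. It cannot run while the worker is busy.
 */
static bool handle_timeout(struct timeout_worker *w, struct admin_cmd *cmd)
{
	if (w->busy)
		return false;	/* the expiry is never handled */
	cmd->completed = true;
	return true;
}

/* Wait for the admin command; bounded polling replaces a real sleep. */
static bool wait_for_admin(struct timeout_worker *w, struct admin_cmd *cmd,
			   int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (cmd->completed)
			return true;
		if (cmd->timed_out)
			handle_timeout(w, cmd);
	}
	return cmd->completed;
}

/*
 * in_timeout_ctx: the reset runs from inside the timeout handler, so the
 * worker is already occupied. Returns true if the reset could finish.
 */
static bool reset_controller(bool in_timeout_ctx, bool device_dead)
{
	struct timeout_worker w = { .busy = in_timeout_ctx };
	struct admin_cmd cmd = {
		.timed_out = device_dead,
		.completed = !device_dead,
	};

	return wait_for_admin(&w, &cmd, 3);
}
```

In this model, reset_controller(true, true) never finishes, because the only
context that could expire the admin command is the one waiting for it. The
same dead device is recoverable when the reset runs elsewhere, as in
reset_controller(false, true).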

In addition, even if we take nvme_wait_freeze out of nvme_eh_reset and put it into another context,
the ctrl state is still CONNECTING, so nvme_eh_reset cannot move forward.

Actually, I reported this issue to Keith before. I hit an I/O hang when the controller died in
nvme_reset_work -> nvme_wait_freeze. As you know, nvme_reset_work cannot be scheduled again because it is still waiting.
Here is Keith's commit for this:
http://lists.infradead.org/pipermail/linux-nvme/2018-February/015603.html

Thanks
Jianchao


