Linux-NVME Archive on lore.kernel.org
From: ming.lei@redhat.com (Ming Lei)
Subject: [PATCH V3 7/8] nvme: pci: recover controller reliably
Date: Fri, 4 May 2018 17:16:43 +0800	[thread overview]
Message-ID: <20180504091630.GD20791@ming.t460p> (raw)
In-Reply-To: <b7c597c0-7c0e-4a16-8f7c-ff3a6d6d17a6@oracle.com>

On Fri, May 04, 2018 at 04:28:23PM +0800, jianchao.wang wrote:
> Hi ming
> 
> On 05/04/2018 04:02 PM, Ming Lei wrote:
> >> nvme_error_handler should invoke nvme_reset_ctrl instead of introducing another interface.
> >> Then it is more convenient to ensure that there will be only one resetting instance running.
> >>
> > But as you mentioned above, reset_work has to be splitted into two
> > contexts for handling IO timeout during wait_freeze in reset_work,
> > so single instance of nvme_reset_ctrl() may not work well.
> 
> I mean the EH kthread and the reset_work which both could reset the ctrl instead of
> the pre and post rest context.

That may run nvme_pre_reset_dev() twice, from both the EH thread and
reset_work, which doesn't look good.

> 
> Honestly, I suspect a bit that whether it is worthy to try to recover from [1].
> The Eh kthread solution could make things easier, but the codes for recovery from [1] has
> made code really complicated. It is more difficult to unify the nvme-pci, rdma and fc. 

IMO the model is not complicated (one EH thread with two-stage resetting),
and it may be cloned to rdma, fc, etc. without much difficulty.

The model is as follows:

1) A single EH thread, in which the controller is guaranteed to be
shut down; the EH thread is woken up whenever the controller needs to
be recovered.

2) 1st-stage resetting: recover the controller, which has to run in
its own work context, since admin commands issued during recovery may
time out.

3) 2nd-stage resetting: drain IO and update the controller state to
LIVE, which has to run in another context, since IOs during this stage
may time out too.

The reset lock is used to sync between the 1st-stage resetting and the
2nd-stage resetting. The 1st-stage resetting is scheduled from the EH
thread, and the 2nd-stage resetting is scheduled only after the
1st-stage resetting is done.

The implementation should not be too complicated either, since V3 has
already partitioned reset_work into two parts; all V4 needs to do is
run each part in its own work item scheduled from the EH thread.

> How about just fail the resetting as the Keith's solution ?
> 
> [1] io timeout when nvme_reset_work or the new nvme_post_reset_dev invoke nvme_wait_freeze.
> 

I'd suggest not changing the controller state to DELETING in this case;
otherwise it may be reported as another bug, in which the controller
becomes completely unusable. This issue can easily be triggered by
blktests block/011, and in that test case the controller is always
recoverable.

I will post V4, and then we can see whether it covers all the current
issues and how complicated the implementation turns out to be.

Thanks
Ming


Thread overview: 19+ messages
2018-05-03  3:17 [PATCH V3 0/8] nvme: pci: fix & improve timeout handling Ming Lei
2018-05-03  3:17 ` [PATCH V3 1/8] block: introduce blk_quiesce_timeout() and blk_unquiesce_timeout() Ming Lei
2018-05-03  3:17 ` [PATCH V3 2/8] nvme: pci: cover timeout for admin commands running in EH Ming Lei
2018-05-03  3:17 ` [PATCH V3 3/8] nvme: pci: only wait freezing if queue is frozen Ming Lei
2018-05-03  3:17 ` [PATCH V3 4/8] nvme: pci: freeze queue in nvme_dev_disable() in case of error recovery Ming Lei
2018-05-03  3:17 ` [PATCH V3 5/8] nvme: fix race between freeze queues and unfreeze queues Ming Lei
2018-05-03  3:17 ` [PATCH V3 6/8] nvme: pci: split controller resetting into two parts Ming Lei
2018-05-03  3:17 ` [PATCH V3 7/8] nvme: pci: recover controller reliably Ming Lei
2018-05-03  9:14   ` jianchao.wang
2018-05-03 10:08     ` Ming Lei
2018-05-03 15:46       ` jianchao.wang
2018-05-04  4:24         ` Ming Lei
2018-05-04  6:10           ` jianchao.wang
2018-05-04  6:21             ` jianchao.wang
2018-05-04  8:02             ` Ming Lei
2018-05-04  8:28               ` jianchao.wang
2018-05-04  9:16                 ` Ming Lei [this message]
2018-05-05  0:16                 ` Ming Lei
2018-05-03  3:17 ` [PATCH V3 8/8] nvme: pci: simplify timeout handling Ming Lei
