From: swise@opengridcomputing.com (Steve Wise)
Subject: nvmf host shutdown hangs when nvmf controllers are in recovery/reconnect
Date: Wed, 24 Aug 2016 15:25:42 -0500
Message-ID: <021c01d1fe45$af8b5e40$0ea21ac0$@opengridcomputing.com>
In-Reply-To: <a004bd27-6efd-98aa-6430-da7aeafd46b0@grimberg.me>

> > Hey Steve,
> >
> > For some reason I can't reproduce this on my setup...
> >
> > So I'm wondering where the nvme_rdma_del_ctrl() thread is stuck.
> > Probably a dump of all the kworkers would be helpful here:
> >
> > $ pids=$(ps -ef | grep kworker | grep -v grep | awk '{print $2}')
> > $ for p in $pids; do echo "$p:"; cat /proc/$p/stack; done
> >

I can't do this because the system is crippled due to shutting down.  I get the
feeling though that the del_ctrl thread isn't getting scheduled. Note that the
difference between 'reboot' and 'reboot -f' is that without the -f, iw_cxgb4
isn't unloaded before we get stuck.  So there has to be some part of 'reboot'
that deletes the controllers for it to work.  But I still don't know what is
stalling the reboot anyway.  Some I/O pending I guess?

> > The fact that nvme1 keeps reconnecting forever, means that
> > del_ctrl() never changes the controller state. Is there an
> > nvme0 on the system that is also being removed and you don't
> > see the reconnecting thread keeps on going?
> >

nvme0 is a local nvme device on my setup.

> > My expectation would be that del_ctrl() would move the ctrl state
> > to DELETING and reconnect thread would bail-out, then the delete_work
> > should fire and delete the controller. Obviously something is not
> > happening like it should.
> 
> I think I know what is going on...
> 
> When we get a surprise disconnect from the target we queue
> a periodic reconnect (which is the sane thing to do...).
> 

Or a keep-alive (kato) timeout.

> We only move the queues out of CONNECTED when we retry
> to reconnect (after 10 seconds in the default case) but we stop
> the blk queues immediately so we are not bothered with traffic from
> now on. If delete() is kicking off in this period the queues are still
> in CONNECTED state.
> 
> Part of the delete sequence is trying to issue ctrl shutdown if the
> admin queue is CONNECTED (which it is!). This request is issued but
> stuck in blk-mq waiting for the queues to start again. This might
> be the one preventing us from forward progress...
> 
> Steve, care to check if the below patch makes things better?
>

This doesn't help.  I'm debugging to get more details.  But can you answer this:
What code initiates the ctrl deletes for the active devices as part of a
'reboot'?  
 
> The patch tries to separate the queue flags to CONNECTED and
> DELETING. Now we will move out of CONNECTED as soon as error recovery
> kicks in (before stopping the queues) and DELETING is on when
> we start the queue deletion.
> 
> --
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 23297c5f85ed..75b49c29b890 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -86,6 +86,7 @@ struct nvme_rdma_request {
> 
>   enum nvme_rdma_queue_flags {
>          NVME_RDMA_Q_CONNECTED = (1 << 0),
> +       NVME_RDMA_Q_DELETING  = (1 << 1),
>   };
> 
>   struct nvme_rdma_queue {
> @@ -612,7 +613,7 @@ static void nvme_rdma_free_queue(struct nvme_rdma_queue *queue)
> 
>   static void nvme_rdma_stop_and_free_queue(struct nvme_rdma_queue *queue)
>   {
> -       if (!test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags))
> +       if (test_and_set_bit(NVME_RDMA_Q_DELETING, &queue->flags))
>                  return;
>          nvme_rdma_stop_queue(queue);
>          nvme_rdma_free_queue(queue);
> @@ -764,8 +765,13 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
>   {
>          struct nvme_rdma_ctrl *ctrl = container_of(work,
>                          struct nvme_rdma_ctrl, err_work);
> +       int i;
> 
>          nvme_stop_keep_alive(&ctrl->ctrl);
> +
> +       for (i = 0; i < ctrl->queue_count; i++)
> +               clear_bit(NVME_RDMA_Q_CONNECTED, &ctrl->queues[i].flags);
> +
>          if (ctrl->queue_count > 1)
>                  nvme_stop_queues(&ctrl->ctrl);
>          blk_mq_stop_hw_queues(ctrl->ctrl.admin_q);
> @@ -1331,7 +1337,7 @@ static int nvme_rdma_device_unplug(struct nvme_rdma_queue *queue)
>          cancel_delayed_work_sync(&ctrl->reconnect_work);
> 
>          /* Disable the queue so ctrl delete won't free it */
> -       if (test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags)) {
> +       if (!test_and_set_bit(NVME_RDMA_Q_DELETING, &queue->flags)) {
>                  /* Free this queue ourselves */
>                  nvme_rdma_stop_queue(queue);
>                  nvme_rdma_destroy_queue_ib(queue);
> --
> 
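
For reference, the intent of the two-flag scheme in the quoted patch can be
modelled in userspace C. This is only a sketch under stated assumptions: the
model_* names and MODEL_Q_* masks are hypothetical stand-ins, plain bit
masking replaces the kernel's atomic test_and_set_bit()/clear_bit(), and
"teardown" is just a counter rather than the real stop/free path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the driver's queue flags. */
enum {
	MODEL_Q_CONNECTED = 1u << 0,
	MODEL_Q_DELETING  = 1u << 1,
};

struct model_queue {
	unsigned int flags;
	int teardowns;	/* how many times stop+free actually ran */
};

/* Non-atomic imitation of test_and_set: returns the previous bit state. */
static bool model_test_and_set(unsigned int *flags, unsigned int mask)
{
	bool was_set = (*flags & mask) != 0;

	*flags |= mask;
	return was_set;
}

/* Error recovery drops CONNECTED right away, before queues are stopped. */
static void model_error_recovery(struct model_queue *q)
{
	assert(q);
	q->flags &= ~MODEL_Q_CONNECTED;
}

/* The delete path only issues ctrl shutdown on a CONNECTED admin queue. */
static bool model_admin_shutdown_allowed(const struct model_queue *q)
{
	return (q->flags & MODEL_Q_CONNECTED) != 0;
}

/* DELETING guards teardown, so only the first caller frees the queue. */
static void model_stop_and_free(struct model_queue *q)
{
	if (model_test_and_set(&q->flags, MODEL_Q_DELETING))
		return;
	q->teardowns++;
}
```

In this model, a queue that starts with only MODEL_Q_CONNECTED set stops
allowing the admin shutdown as soon as model_error_recovery() has run, and
a second model_stop_and_free() call is a no-op.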
