All of lore.kernel.org
 help / color / mirror / Atom feed
From: swise@opengridcomputing.com (Steve Wise)
Subject: [PATCH 2/2] nvme-rdma: move admin queue cleanup to nvme_rdma_free_ctrl
Date: Mon, 18 Jul 2016 09:55:02 -0500	[thread overview]
Message-ID: <012601d1e104$5cde3400$169a9c00$@opengridcomputing.com> (raw)
In-Reply-To: <578B1F3B.3020100@grimberg.me>

> > Hey Sagi, here is some lite reading for you. :)
> >
> > Prelude:  As part of disconnecting an iwarp connection, the iwarp provider
needs
> > to post an IW_CM_EVENT_CLOSE event to iw_cm, which is scheduled onto the
> > singlethread workq thread for iw_cm.
> >
> > Here is what happens with Sagi's patch:
> >
> > nvme_rdma_device_unplug() calls nvme_rdma_stop_queue() which calls
> > rdma_disconnect().  This triggers the disconnect.  iw_cxgb4 posts the
> > IW_CM_EVENT_CLOSE to iw_cm, which ends up calling cm_close_handler() in the
> > iw_cm workq thread context.  cm_close_handler() calls the rdma_cm event
> handler
> > for this cm_id, function cm_iw_handler(), which blocks until any currently
> > running event handler for this cm_id finishes.  It does this by calling
> > cm_disable_callback().  However since this whole unplug process is running
in
> > the event handler function for this same cm_id, the iw_cm workq thread is
now
> > stuck in a deadlock.   nvme_rdma_device_unplug() however, continues on and
> > schedules the controller delete worker thread and waits for it to complete.
The
> > delete controller worker thread tries to disconnect and destroy all the
> > remaining IO queues, but gets stuck in the destroy() path on the first IO
queue
> > because the iw_cm workq thread is already stuck, and processing the CLOSE
> event
> > is required to release a reference the iw_cm has on the iwarp providers qp.
So
> > everything comes to a grinding halt....
> >
> > Now: Ming's 2 patches avoid this deadlock because the cm_id that received
the
> > device removal event is disconnected/destroyed _only after_ all the
controller
> > queues are disconnected/destroyed.  So nvme_rdma_device_unplug() doesn't get
> > stuck waiting for the controller to delete the io queues, and only after
that
> > completes, does it delete the cm_id/qp that got the device removal event.
It
> > then returns thus causing the rdma_cm to release the cm_id's callback mutex.
> > This causes the iw_cm workq thread to now unblock and we continue on.  (can
> you
> > say house of cards?)
> >
> > So the net is:  the cm_id that received the device remove event _must_ be
> > disconnect/destroyed _last_.
> 
> Hey Steve, thanks for the detailed description.
> 
> IMHO, this really looks like a buggy design in iWARP connection
> management implementation. The fact that a rdma_disconnect forward
> progress is dependent on other cm_id execution looks wrong and
> backwards to me.
> 

Yea.  I'm not sure how to fix it at this point...

> Given that the device is being removed altogether I don't think that
> calling rdma_disconnect is important at all. Given the fact that
> ib_drain_qp makes sure that the queue-pair is in error state does
> this incremental patch resolve the iwarp-deadlock you are seeing?
>

I'll need to change c4iw_drain_sq/rq() to move the QP to ERROR explicitly.  When
I implemented them I assumed the QP would be disconnected.  Let me try this and
report back.
 
> --
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index fd90b5c00aae..e7eec7d8c705 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1342,8 +1342,12 @@ static int nvme_rdma_device_unplug(struct
> nvme_rdma_queue *queue)
>                  goto queue_delete;
>          }
> 
> +       /*
> +        * iwcm does not handle rdma_disconnect within DEVICE_REMOVAL
> +        * event very well, so we settle with qp drain
> +        */
> +       ib_drain_qp(queue->qp);
>          /* Free this queue ourselves */
> -       nvme_rdma_stop_queue(queue);
>          nvme_rdma_destroy_queue_ib(queue);
> 
>          /* Return non-zero so the cm_id will destroy implicitly */
> --

  reply	other threads:[~2016-07-18 14:55 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-13 21:26 [PATCH 0/2] nvme-rdma: device removal crash fixes Ming Lin
2016-07-13 21:26 ` [PATCH 1/2] nvme-rdma: grab reference for device removal event Ming Lin
2016-07-13 21:33   ` Steve Wise
2016-07-13 21:26 ` [PATCH 2/2] nvme-rdma: move admin queue cleanup to nvme_rdma_free_ctrl Ming Lin
2016-07-13 21:33   ` Steve Wise
2016-07-13 23:19   ` J Freyensee
2016-07-13 23:36     ` Ming Lin
2016-07-13 23:59       ` J Freyensee
2016-07-14  6:39         ` Ming Lin
2016-07-14 17:09           ` J Freyensee
2016-07-14 18:04             ` Ming Lin
2016-07-14  9:15   ` Sagi Grimberg
2016-07-14  9:17     ` Sagi Grimberg
2016-07-14 14:30       ` Steve Wise
2016-07-14 14:44         ` Sagi Grimberg
2016-07-14 14:59     ` Steve Wise
     [not found]     ` <011301d1dde0$4450e4e0$ccf2aea0$@opengridcomputing.com>
2016-07-14 15:02       ` Steve Wise
2016-07-14 15:26         ` Steve Wise
2016-07-14 21:27           ` Steve Wise
2016-07-15 15:52             ` Steve Wise
2016-07-17  6:01               ` Sagi Grimberg
2016-07-18 14:55                 ` Steve Wise [this message]
2016-07-18 15:47                   ` Steve Wise
2016-07-18 16:34                     ` Steve Wise
2016-07-18 18:04                       ` Steve Wise
2016-07-13 21:58 ` [PATCH 0/2] nvme-rdma: device removal crash fixes Steve Wise

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='012601d1e104$5cde3400$169a9c00$@opengridcomputing.com' \
    --to=swise@opengridcomputing.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.