From: swise@opengridcomputing.com (Steve Wise)
Subject: [PATCH 3/3] nvme-rdma: Fix device removal handling
Date: Fri, 22 Jul 2016 13:37:43 -0500 [thread overview]
Message-ID: <00ef01d1e448$225132a0$66f397e0$@opengridcomputing.com> (raw)
In-Reply-To: <0cb1ccaa920b3ec48dd94ea49fa0f0b7c5520d38.1468879135.git.swise@opengridcomputing.com>
> Device removal sequence may have crashed because the
> controller (and admin queue space) was freed before
> we destroyed the admin queue resources. Thus we
> want to destroy the admin queue and only then queue
> controller deletion and wait for it to complete.
>
> More specifically we:
> 1. own the controller deletion (make sure we are not
> competing with another deletion).
> 2. get rid of in-flight reconnects, if they exist (which
> also destroy and create queues).
> 3. destroy the queue.
> 4. safely queue controller deletion (and wait for it
> to complete).
>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
> drivers/nvme/host/rdma.c | 49 ++++++++++++++++++++++++++----------------------
> 1 file changed, 27 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 3e3ce2b..0e58450 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -169,7 +169,6 @@ MODULE_PARM_DESC(register_always,
> static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
> struct rdma_cm_event *event);
> static void nvme_rdma_recv_done(struct ib_cq *cq, struct ib_wc *wc);
> -static int __nvme_rdma_del_ctrl(struct nvme_rdma_ctrl *ctrl);
>
> /* XXX: really should move to a generic header sooner or later.. */
> static inline void put_unaligned_le24(u32 val, u8 *p)
> @@ -1318,37 +1317,43 @@ out_destroy_queue_ib:
> * that caught the event. Since we hold the callout until the controller
> * deletion is completed, we'll deadlock if the controller deletion will
> * call rdma_destroy_id on this queue's cm_id. Thus, we claim ownership
> - * of destroying this queue before-hand, destroy the queue resources
> - * after the controller deletion completed with the exception of destroying
> - * the cm_id implicitely by returning a non-zero rc to the callout.
> + * of destroying this queue before-hand, destroy the queue resources,
> + * then queue the controller deletion which won't destroy this queue and
> + * we destroy the cm_id implicitly by returning a non-zero rc to the callout.
> */
> static int nvme_rdma_device_unplug(struct nvme_rdma_queue *queue)
> {
> struct nvme_rdma_ctrl *ctrl = queue->ctrl;
> - int ret, ctrl_deleted = 0;
> + int ret;
>
> - /* First disable the queue so ctrl delete won't free it */
> - if (!test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags))
> - goto out;
> + /* Own the controller deletion */
> + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING))
> + return 0;
>
> - /* delete the controller */
> - ret = __nvme_rdma_del_ctrl(ctrl);
> - if (!ret) {
> - dev_warn(ctrl->ctrl.device,
> - "Got rdma device removal event, deleting ctrl\n");
> - flush_work(&ctrl->delete_work);
> + dev_warn(ctrl->ctrl.device,
> + "Got rdma device removal event, deleting ctrl\n");
>
> - /* Return non-zero so the cm_id will destroy implicitly */
> - ctrl_deleted = 1;
> + /* Get rid of reconnect work if it's running */
> + cancel_delayed_work_sync(&ctrl->reconnect_work);
>
> - /* Free this queue ourselves */
> - rdma_disconnect(queue->cm_id);
> - ib_drain_qp(queue->qp);
> - nvme_rdma_destroy_queue_ib(queue);
> + /* Disable the queue so ctrl delete won't free it */
> + if (!test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags)) {
> + ret = 0;
> + goto queue_delete;
> }
>
> -out:
> - return ctrl_deleted;
> + /* Free this queue ourselves */
> + nvme_rdma_stop_queue(queue);
> + nvme_rdma_destroy_queue_ib(queue);
> +
> + /* Return non-zero so the cm_id will destroy implicitly */
> + ret = 1;
> +
> +queue_delete:
> + /* queue controller deletion */
> + queue_work(nvme_rdma_wq, &ctrl->delete_work);
> + flush_work(&ctrl->delete_work);
Actually, since the queue_work() fires off the workqueue thread to delete the
controller and its resources (potentially on another cpu), I think the
flush_work() could end up being a use-after-free, because it accesses *ctrl,
no?
2016-07-18 21:58 [PATCH RFC 0/3] iwarp device removal deadlock fix Steve Wise
2016-07-18 20:44 ` [PATCH 1/3] iw_cm: free cm_id resources on the last deref Steve Wise
2016-07-20 8:51 ` Sagi Grimberg
2016-07-20 13:51 ` Steve Wise
2016-07-21 14:17 ` Steve Wise
[not found] ` <045f01d1e35a$93618a60$ba249f20$@opengridcomputing.com>
2016-07-21 15:45 ` Steve Wise
2016-07-18 20:44 ` [PATCH 2/3] iw_cxgb4: don't block in destroy_qp awaiting " Steve Wise
2016-07-20 8:52 ` Sagi Grimberg
2016-07-18 20:44 ` [PATCH 3/3] nvme-rdma: Fix device removal handling Sagi Grimberg
2016-07-21 8:15 ` Christoph Hellwig
2016-07-22 18:37 ` Steve Wise [this message]
2016-07-20 8:47 ` [PATCH RFC 0/3] iwarp device removal deadlock fix Sagi Grimberg
2016-07-20 13:49 ` Steve Wise