From: swise@opengridcomputing.com (Steve Wise)
Subject: [PATCH 3/3] nvme-rdma: Fix device removal handling
Date: Fri, 22 Jul 2016 13:37:43 -0500 [thread overview]
Message-ID: <00ef01d1e448$225132a0$66f397e0$@opengridcomputing.com> (raw)
In-Reply-To: <0cb1ccaa920b3ec48dd94ea49fa0f0b7c5520d38.1468879135.git.swise@opengridcomputing.com>
> Device removal sequence may have crashed because the
> controller (and admin queue space) was freed before
> we destroyed the admin queue resources. Thus we
> want to destroy the admin queue and only then queue
> controller deletion and wait for it to complete.
>
> More specifically we:
> 1. own the controller deletion (make sure we are not
> competing with another deletion).
> 2. get rid of inflight reconnects if exists (which
> also destroy and create queues).
> 3. destroy the queue.
> 4. safely queue controller deletion (and wait for it
> to complete).
>
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
> ---
> drivers/nvme/host/rdma.c | 49
++++++++++++++++++++++++++----------------------
> 1 file changed, 27 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 3e3ce2b..0e58450 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -169,7 +169,6 @@ MODULE_PARM_DESC(register_always,
> static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
> struct rdma_cm_event *event);
> static void nvme_rdma_recv_done(struct ib_cq *cq, struct ib_wc *wc);
> -static int __nvme_rdma_del_ctrl(struct nvme_rdma_ctrl *ctrl);
>
> /* XXX: really should move to a generic header sooner or later.. */
> static inline void put_unaligned_le24(u32 val, u8 *p)
> @@ -1318,37 +1317,43 @@ out_destroy_queue_ib:
> * that caught the event. Since we hold the callout until the controller
> * deletion is completed, we'll deadlock if the controller deletion will
> * call rdma_destroy_id on this queue's cm_id. Thus, we claim ownership
> - * of destroying this queue before-hand, destroy the queue resources
> - * after the controller deletion completed with the exception of destroying
> - * the cm_id implicitely by returning a non-zero rc to the callout.
> + * of destroying this queue before-hand, destroy the queue resources,
> + * then queue the controller deletion which won't destroy this queue and
> + * we destroy the cm_id implicitely by returning a non-zero rc to the
callout.
> */
> static int nvme_rdma_device_unplug(struct nvme_rdma_queue *queue)
> {
> struct nvme_rdma_ctrl *ctrl = queue->ctrl;
> - int ret, ctrl_deleted = 0;
> + int ret;
>
> - /* First disable the queue so ctrl delete won't free it */
> - if (!test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags))
> - goto out;
> + /* Own the controller deletion */
> + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING))
> + return 0;
>
> - /* delete the controller */
> - ret = __nvme_rdma_del_ctrl(ctrl);
> - if (!ret) {
> - dev_warn(ctrl->ctrl.device,
> - "Got rdma device removal event, deleting ctrl\n");
> - flush_work(&ctrl->delete_work);
> + dev_warn(ctrl->ctrl.device,
> + "Got rdma device removal event, deleting ctrl\n");
>
> - /* Return non-zero so the cm_id will destroy implicitly */
> - ctrl_deleted = 1;
> + /* Get rid of reconnect work if its running */
> + cancel_delayed_work_sync(&ctrl->reconnect_work);
>
> - /* Free this queue ourselves */
> - rdma_disconnect(queue->cm_id);
> - ib_drain_qp(queue->qp);
> - nvme_rdma_destroy_queue_ib(queue);
> + /* Disable the queue so ctrl delete won't free it */
> + if (!test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags)) {
> + ret = 0;
> + goto queue_delete;
> }
>
> -out:
> - return ctrl_deleted;
> + /* Free this queue ourselves */
> + nvme_rdma_stop_queue(queue);
> + nvme_rdma_destroy_queue_ib(queue);
> +
> + /* Return non-zero so the cm_id will destroy implicitly */
> + ret = 1;
> +
> +queue_delete:
> + /* queue controller deletion */
> + queue_work(nvme_rdma_wq, &ctrl->delete_work);
> + flush_work(&ctrl->delete_work);
Actually, since the queue_work() fires off the workq thread to delete the
controller and its resources (on another cpu potentially), and think the
flush_work() could end up being a touch-after-free because it accesses *ctrl,
no?
WARNING: multiple messages have this Message-ID (diff)
From: "Steve Wise" <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: 'Sagi Grimberg' <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: sean.hefty-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org,
mlin-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
hch-jcswGhMUV9g@public.gmane.org,
linux-nvme-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
Subject: RE: [PATCH 3/3] nvme-rdma: Fix device removal handling
Date: Fri, 22 Jul 2016 13:37:43 -0500 [thread overview]
Message-ID: <00ef01d1e448$225132a0$66f397e0$@opengridcomputing.com> (raw)
In-Reply-To: <0cb1ccaa920b3ec48dd94ea49fa0f0b7c5520d38.1468879135.git.swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
> Device removal sequence may have crashed because the
> controller (and admin queue space) was freed before
> we destroyed the admin queue resources. Thus we
> want to destroy the admin queue and only then queue
> controller deletion and wait for it to complete.
>
> More specifically we:
> 1. own the controller deletion (make sure we are not
> competing with another deletion).
> 2. get rid of inflight reconnects if exists (which
> also destroy and create queues).
> 3. destroy the queue.
> 4. safely queue controller deletion (and wait for it
> to complete).
>
> Signed-off-by: Sagi Grimberg <sagi-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>
> ---
> drivers/nvme/host/rdma.c | 49
++++++++++++++++++++++++++----------------------
> 1 file changed, 27 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 3e3ce2b..0e58450 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -169,7 +169,6 @@ MODULE_PARM_DESC(register_always,
> static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
> struct rdma_cm_event *event);
> static void nvme_rdma_recv_done(struct ib_cq *cq, struct ib_wc *wc);
> -static int __nvme_rdma_del_ctrl(struct nvme_rdma_ctrl *ctrl);
>
> /* XXX: really should move to a generic header sooner or later.. */
> static inline void put_unaligned_le24(u32 val, u8 *p)
> @@ -1318,37 +1317,43 @@ out_destroy_queue_ib:
> * that caught the event. Since we hold the callout until the controller
> * deletion is completed, we'll deadlock if the controller deletion will
> * call rdma_destroy_id on this queue's cm_id. Thus, we claim ownership
> - * of destroying this queue before-hand, destroy the queue resources
> - * after the controller deletion completed with the exception of destroying
> - * the cm_id implicitely by returning a non-zero rc to the callout.
> + * of destroying this queue before-hand, destroy the queue resources,
> + * then queue the controller deletion which won't destroy this queue and
> + * we destroy the cm_id implicitely by returning a non-zero rc to the
callout.
> */
> static int nvme_rdma_device_unplug(struct nvme_rdma_queue *queue)
> {
> struct nvme_rdma_ctrl *ctrl = queue->ctrl;
> - int ret, ctrl_deleted = 0;
> + int ret;
>
> - /* First disable the queue so ctrl delete won't free it */
> - if (!test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags))
> - goto out;
> + /* Own the controller deletion */
> + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING))
> + return 0;
>
> - /* delete the controller */
> - ret = __nvme_rdma_del_ctrl(ctrl);
> - if (!ret) {
> - dev_warn(ctrl->ctrl.device,
> - "Got rdma device removal event, deleting ctrl\n");
> - flush_work(&ctrl->delete_work);
> + dev_warn(ctrl->ctrl.device,
> + "Got rdma device removal event, deleting ctrl\n");
>
> - /* Return non-zero so the cm_id will destroy implicitly */
> - ctrl_deleted = 1;
> + /* Get rid of reconnect work if its running */
> + cancel_delayed_work_sync(&ctrl->reconnect_work);
>
> - /* Free this queue ourselves */
> - rdma_disconnect(queue->cm_id);
> - ib_drain_qp(queue->qp);
> - nvme_rdma_destroy_queue_ib(queue);
> + /* Disable the queue so ctrl delete won't free it */
> + if (!test_and_clear_bit(NVME_RDMA_Q_CONNECTED, &queue->flags)) {
> + ret = 0;
> + goto queue_delete;
> }
>
> -out:
> - return ctrl_deleted;
> + /* Free this queue ourselves */
> + nvme_rdma_stop_queue(queue);
> + nvme_rdma_destroy_queue_ib(queue);
> +
> + /* Return non-zero so the cm_id will destroy implicitly */
> + ret = 1;
> +
> +queue_delete:
> + /* queue controller deletion */
> + queue_work(nvme_rdma_wq, &ctrl->delete_work);
> + flush_work(&ctrl->delete_work);
Actually, since the queue_work() fires off the workq thread to delete the
controller and its resources (on another cpu potentially), and think the
flush_work() could end up being a touch-after-free because it accesses *ctrl,
no?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2016-07-22 18:37 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-07-18 21:58 [PATCH RFC 0/3] iwarp device removal deadlock fix Steve Wise
2016-07-18 21:58 ` Steve Wise
2016-07-18 20:44 ` [PATCH 1/3] iw_cm: free cm_id resources on the last deref Steve Wise
2016-07-18 20:44 ` Steve Wise
2016-07-20 8:51 ` Sagi Grimberg
2016-07-20 8:51 ` Sagi Grimberg
2016-07-20 13:51 ` Steve Wise
2016-07-20 13:51 ` Steve Wise
2016-07-21 14:17 ` Steve Wise
2016-07-21 14:17 ` Steve Wise
[not found] ` <045f01d1e35a$93618a60$ba249f20$@opengridcomputing.com>
2016-07-21 15:45 ` Steve Wise
2016-07-21 15:45 ` Steve Wise
2016-07-18 20:44 ` [PATCH 2/3] iw_cxgb4: don't block in destroy_qp awaiting " Steve Wise
2016-07-18 20:44 ` Steve Wise
2016-07-20 8:52 ` Sagi Grimberg
2016-07-20 8:52 ` Sagi Grimberg
2016-07-18 20:44 ` [PATCH 3/3] nvme-rdma: Fix device removal handling Sagi Grimberg
2016-07-18 20:44 ` Sagi Grimberg
2016-07-21 8:15 ` Christoph Hellwig
2016-07-21 8:15 ` Christoph Hellwig
2016-07-22 18:37 ` Steve Wise [this message]
2016-07-22 18:37 ` Steve Wise
2016-07-20 8:47 ` [PATCH RFC 0/3] iwarp device removal deadlock fix Sagi Grimberg
2016-07-20 8:47 ` Sagi Grimberg
2016-07-20 13:49 ` Steve Wise
2016-07-20 13:49 ` Steve Wise
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='00ef01d1e448$225132a0$66f397e0$@opengridcomputing.com' \
--to=swise@opengridcomputing.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.